bib_card 0.5.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 4ae3e321e881cca41e3c901fdaa406fdad13bb4942e1e5fc5aba0ed9978aa510
4
+ data.tar.gz: 93fe803b13025531d593fb20ad0e8df175c31f9286fa87505696e48ca49d33e4
5
+ SHA512:
6
+ metadata.gz: 228e39fc5badb23ad83bffbcb0ddec432f245b7ba92f0fea584cc29e1edb1c6b677a399597acda26f99d70a99576d081634df0b09a34972d96cd13a298f2adde
7
+ data.tar.gz: 38651eac526d014b72f914cbf457469125a5229a1a6ad75e3d1caf7bc8bae91721fbfc5de9bff27e83de7d7bf74dcacf4a53b3357a09d9f8d6e219dbc6ee9bd9
@@ -0,0 +1,10 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
10
+ *.gem
data/.rspec ADDED
@@ -0,0 +1,3 @@
1
+ --require spec_helper
2
+ --format documentation
3
+ --color
@@ -0,0 +1,4 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.2.2
4
+ before_install: gem install bundler -v 1.10.6
@@ -0,0 +1,13 @@
1
+ # Contributor Code of Conduct
2
+
3
+ As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.
4
+
5
+ We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.
6
+
7
+ Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.
8
+
9
+ Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team.
10
+
11
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers.
12
+
13
+ This Code of Conduct is adapted from the [Contributor Covenant](http://contributor-covenant.org), version 1.0.0, available at [http://contributor-covenant.org/version/1/0/0/](http://contributor-covenant.org/version/1/0/0/)
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in bib_card.gemspec
4
+ gemspec
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2016 University of Wisconsin Board of Regents
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
@@ -0,0 +1,88 @@
1
+ # BibCard
2
+
3
+ BibCard is a Ruby library for retrieving and assembling knowledge card information about the authors found in bibliographic data. It takes identifiers such as Library of Congress Name Authority File (LCNAF) or VIAF URIs as input and crawls Linked Open Data sources on the web to assemble Ruby objects or RDF serializations. This library will fetch data from:
4
+
5
+ * [Virtual International Authority File (VIAF)](http://viaf.org/)
6
+ * [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page)
7
+ * [DBpedia](http://wiki.dbpedia.org/)
8
+ * [Getty Vocabularies LOD](http://vocab.getty.edu/)
9
+
10
+ The VIAF URI lies at the core of the `BibCard::Person` object because it acts as a hub to many other data sources on the Web. With the VIAF data in hand the other three sources listed above are "crawled" for more information about a given identity. Technically the data is requested by making one or more HTTP requests to each of the data sources' public SPARQL endpoints.
11
+
12
+ `BibCard` makes extensive use of the [Spira](https://github.com/ruby-rdf/spira) library for RDF-to-object mapping. The result is that after assembling a micrograph of knowledge card data the client can work with simple code objects.
13
+
14
+ ## Installation
15
+
16
+ This gem is not yet in rubygems. Until it has been tested more thoroughly, please build and install it from the command line.
17
+
18
+ ```bash
19
+ $ git clone https://github.com/UW-Madison-Library/bibcard
20
+ $ cd bibcard
21
+ $ bundle install
22
+ $ gem build bib_card.gemspec
23
+ $ gem install bib_card-<VERSION-NUMBER>.gem
24
+ ```
25
+
26
+ ## Usage
27
+
28
+ ### Instantiate a `BibCard::Person`
29
+
30
+ Given a Library of Congress Name Authority File or VIAF URI, instantiate a `BibCard::Person` and inspect the data.
31
+
32
+ *Note:* Every call to `BibCard.person()` will make many calls to the public SPARQL endpoints for the sources cited above.
33
+
34
+ ```ruby
35
+ require 'bib_card'
36
+
37
+ lcnaf_uri = "http://id.loc.gov/authorities/names/n78086005"
38
+ person = BibCard.person(lcnaf_uri)
39
+
40
+ person.name(["en", "en-US"]) # => "Pablo Picasso"
41
+ person.birth_date # => "1881-10-25"
42
+ person.death_date # => "1973-04-09"
43
+
44
+ person.dbpedia_resource # => <BibCard::DBPedia::Resource:70307318111440 @subject: http://dbpedia.org/resource/Pablo_Picasso>
45
+ person.dbpedia_resource.abstract # => "Pablo Ruiz y Picasso, also known as Pablo Picasso (/pɪˈkɑːsoʊ, -ˈkæsoʊ/; Spanish: [ˈpaβlo piˈkaso]; 25 October 1881 – 8 April 1973), was a Spanish painter..."
46
+
47
+ person.getty_subject # => <BibCard::Getty::Subject:70307331508400 @subject: http://vocab.getty.edu/ulan/500009666>
48
+ person.getty_subject.scope_note # => <BibCard::Getty::ScopeNote:70307331409520 @subject: http://vocab.getty.edu/ulan/scopeNote/53649>
49
+ person.getty_subject.scope_note.value # => "Long-lived and very influential Spanish artist, active in France. He dominated 20th-century European art. With Georges Braque, he is credited with inventing Cubism."
50
+ person.getty_subject.scope_note.sources # => [<BibCard::Getty::Source:70307327167300 @subject: http://vocab.getty.edu/ulan/source/2100153925>, <BibCard::Getty::Source:70307327106100 @subject: http://vocab.getty.edu/ulan/source/2100156698>]
51
+ person.getty_subject.scope_note.sources.map {|source| source.short_title} # => ["LCNAF Library of Congress Name Authority File [n.d.]", "Grove Dictionary of Art online (1999-2002)"]
52
+ ```
53
+
54
+ ### Fetch Raw Data for a `BibCard::Person`
55
+
56
+ A BibCard knowledge/info card is generated from many different sources, which is inherently slow. To avoid repeating that work, you can also retrieve person data as a serialized string of RDF N-Triples and cache it locally. Once the data is cached you can load it into a [Spira](https://github.com/ruby-rdf/spira) repository and instantiate a `BibCard::Person` object.
57
+
58
+ ```ruby
59
+ require 'bib_card'
60
+
61
+ lcnaf_uri = "http://id.loc.gov/authorities/names/n78086005"
62
+ data = BibCard.person_data(lcnaf_uri)
63
+ puts data
64
+
65
+ # <http://viaf.org/viaf/15873> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
66
+ # <http://viaf.org/viaf/15873> <http://schema.org/deathDate> "1973-04-09" .
67
+ # <http://viaf.org/viaf/15873> <http://schema.org/sameAs> <http://id.loc.gov/authorities/names/n78086005> .
68
+ # ...
69
+
70
+ #### cache the serialized data ####
71
+
72
+ Spira.repository = RDF::Repository.new.from_ntriples(data)
73
+ viaf_uri = Spira.repository.query(predicate: BibCard::SCHEMA_SAME_AS, object: RDF::URI.new(lcnaf_uri)).first.subject
74
+ person = viaf_uri.as(BibCard::Person)
75
+
76
+ person # => <BibCard::Person:70307327106900 @subject: http://viaf.org/viaf/15873>
77
+ person.name(["en", "en-US"]) # => "Pablo Picasso"
78
+ ```
79
+
80
+ ## Contributing
81
+
82
+ Bug reports and pull requests are welcome on GitHub at https://github.com/UW-Madison-Library/bibcard. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
83
+
84
+
85
+ ## License
86
+
87
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
88
+
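The README recommends caching the serialized N-Triples but leaves the cache itself to the client. The following is a minimal sketch of one way to do that with a file cache, using only the Ruby standard library. The helper name `cached_person_data`, the cache directory layout, and the stubbed fetcher are assumptions for illustration; in a real application the block would presumably call `BibCard.person_data(uri)`.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical helper (not part of bib_card): returns cached N-Triples for a
# URI and calls the block only on a cache miss. The block stands in for
# BibCard.person_data.
def cached_person_data(uri, cache_dir)
  FileUtils.mkdir_p(cache_dir)
  # Hash the URI so it can be used safely as a file name.
  path = File.join(cache_dir, Digest::SHA256.hexdigest(uri) + ".nt")
  return File.read(path) if File.exist?(path)

  data = yield(uri)
  File.write(path, data)
  data
end

# Stubbed fetcher so the sketch runs without network access.
calls = 0
fetcher = lambda do |uri|
  calls += 1
  %(<http://viaf.org/viaf/15873> <http://schema.org/sameAs> <#{uri}> .\n)
end

lcnaf_uri = "http://id.loc.gov/authorities/names/n78086005"
cache_dir = Dir.mktmpdir("bibcard-cache")

first  = cached_person_data(lcnaf_uri, cache_dir, &fetcher) # fetches and writes
second = cached_person_data(lcnaf_uri, cache_dir, &fetcher) # read from disk
```

The second call never reaches the fetcher, so the slow crawl would run once per URI. The cached string could then be loaded with `RDF::Repository.new.from_ntriples(data)` as shown in the README.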
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
@@ -0,0 +1,42 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'bib_card/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "bib_card"
8
+ spec.version = BibCard::VERSION
9
+ spec.authors = ["Steve Meyer"]
10
+ spec.email = ["stephen.meyer@wisc.edu"]
11
+
12
+ spec.summary = %q{Library Linked Data for building knowledge cards.}
13
+ spec.description = %q{Given a URI for a bibliographic author entity, assemble useful information for producing a knowledge card.}
14
+ spec.homepage = "https://github.com/UW-Madison-Library/bibcard.git"
15
+ spec.license = "MIT"
16
+
17
+ # Prevent pushing this gem to RubyGems.org by setting 'allowed_push_host', or
18
+ # delete this section to allow pushing this gem to any host.
19
+ # if spec.respond_to?(:metadata)
20
+ # spec.metadata['allowed_push_host'] = "TODO: Set to 'http://mygemserver.com'"
21
+ # else
22
+ # raise "RubyGems 2.0 or newer is required to protect against public gem pushes."
23
+ # end
24
+
25
+ spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
26
+ spec.bindir = "exe"
27
+ spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
28
+ spec.require_paths = ["lib"]
29
+
30
+ spec.add_runtime_dependency "rdf", "~> 3.0", ">= 3.0.1"
31
+ spec.add_runtime_dependency "rdf-rdfxml", "~> 3.1.0"
32
+ spec.add_runtime_dependency "spira", "~> 3.0"
33
+ spec.add_runtime_dependency "rest-client", '~> 2.0.2'
34
+ spec.add_runtime_dependency "nokogiri", "~> 1.11.1"
35
+ spec.add_runtime_dependency "equivalent-xml", "~> 0.6"
36
+
37
+ spec.add_development_dependency "bundler", "~> 2.2.5"
38
+ spec.add_development_dependency "rake", "~> 13.0.1"
39
+ spec.add_development_dependency "rspec", "~> 3.4"
40
+ spec.add_development_dependency "simplecov", "~> 0.11", ">= 0.11.2"
41
+ spec.add_development_dependency "webmock", "~> 2.0", ">= 2.0.3"
42
+ end
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "bib_card"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start
@@ -0,0 +1,7 @@
1
+ #!/bin/bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+
5
+ bundle install
6
+
7
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,112 @@
1
+ require "openssl"
2
+ require "rdf"
3
+ require "rdf/rdfxml"
4
+ require "rdf/xsd"
5
+ require "spira"
6
+ require "rest-client"
7
+ require "json"
8
+
9
+ require "bib_card/version"
10
+ require "bib_card/uris"
11
+ require "bib_card/author"
12
+ require "bib_card/person"
13
+ require "bib_card/crawler"
14
+ require "bib_card/invalid_uri_exception"
15
+ require "bib_card/entity_not_found_exception"
16
+ require "bib_card/crawl_exception"
17
+ require "bib_card/db_pedia/resource"
18
+ require "bib_card/getty/scope_note"
19
+ require "bib_card/getty/source"
20
+ require "bib_card/getty/subject"
21
+ require "bib_card/wikidata/entity"
22
+
23
+ module BibCard
24
+
25
+ class << self
26
+ attr_writer :logger
27
+
28
+ def logger
29
+ @logger ||= Logger.new($stdout).tap do |logger|
30
+ logger.progname = self.name
31
+ logger.formatter = proc do |severity, time, progname, msg|
32
+ "#{severity} [#{time.strftime('%Y-%m-%d %H:%M:%S.%L')}] #{progname}: #{msg}\n"
33
+ end
34
+ end
35
+ end
36
+
37
+ def person_data(uri)
38
+ graph, viaf_uri = creator_graph_and_viaf_uri(uri)
39
+ graph.dump(:ntriples)
40
+ end
41
+
42
+ def person(uri)
43
+ graph, viaf_uri = creator_graph_and_viaf_uri(uri)
44
+ Spira.repository = graph
45
+ viaf_uri.as(Person)
46
+ end
47
+
48
+ def viaf_uri?(uri)
49
+ url = uri.to_s
50
+ url.match(/^http:\/\/viaf\.org\/viaf\/\d+$/).nil? ? false : true
51
+ end
52
+
53
+ def lcnaf_uri?(uri)
54
+ url = uri.to_s
55
+ url.match(/^http:\/\/id\.loc\.gov\/authorities\/names\/n[bors]{0,1}\d+$/).nil? ? false : true
56
+ end
57
+
58
+ private
59
+
60
+ def creator_graph_and_viaf_uri(uri)
61
+ # Convert the URI to an RDF::URI object if it is not already
62
+ uri = convert_uri(uri)
63
+
64
+ # 1. Get the VIAF data and determine the VIAF URI
65
+ begin
66
+ if lcnaf_uri?(uri)
67
+ # Load the VIAF data graph and determine the VIAF URI based on the LCNAF URI.
68
+ identifier = lcnaf_uri_to_identifier(uri)
69
+ viaf_url = "http://viaf.org/viaf/sourceID/" + URI.encode_www_form_component("LC|#{identifier}")
70
+ viaf_graph = RDF::Graph.load(viaf_url, format: :rdfxml)
71
+ viaf_uri = viaf_graph.query({predicate: SCHEMA_SAME_AS, object: uri}).first.subject
72
+ elsif viaf_uri?(uri)
73
+ # Load the VIAF data graph using the URI
74
+ viaf_uri = uri
75
+ viaf_graph = RDF::Graph.load(uri, format: :rdfxml)
76
+ else
77
+ raise BibCard::InvalidURIException
78
+ end
79
+ rescue IOError
80
+ raise BibCard::EntityNotFoundException
81
+ rescue Errno::ECONNRESET
82
+ raise BibCard::CrawlException.new("Unable to access VIAF, connection reset by peer.")
83
+ rescue NoMethodError => e
84
+ undifferentiated_uri_msg = "This VIAF URI has been corrupted by an 'undifferentiate name' and should be treated as unusable."
85
+ results = viaf_graph.query({predicate: RDFS_COMMENT, object: undifferentiated_uri_msg})
86
+ if results.size > 0
87
+ raise BibCard::EntityNotFoundException.new(undifferentiated_uri_msg)
88
+ else
89
+ raise e
90
+ end
91
+ end
92
+
93
+ # 2. Use the VIAF graph as the basis for crawling the other data sources
94
+ crawler = Crawler.new(viaf_uri, viaf_graph)
95
+ graph = crawler.creator_graph
96
+ [graph, viaf_uri]
97
+ end
98
+
99
+ def lcnaf_uri_to_identifier(uri)
100
+ url = uri.to_s
101
+ url.gsub("http://id.loc.gov/authorities/names/", "")
102
+ end
103
+
104
+ # Convert the given URI to an RDF::URI unless it already is one
105
+ def convert_uri(uri)
106
+ uri.is_a?(RDF::URI) ? uri : RDF::URI.new(uri)
107
+ end
108
+ end
109
+ end
110
+
111
+ # Rails support
112
+ require 'bib_card/railtie' if defined?(Rails)
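The entry point above accepts either an LCNAF or a VIAF URI and, for LCNAF, resolves it through VIAF's `sourceID` lookup. The sketch below isolates that routing step with the standard library only. The helper name `viaf_lookup_url` and the use of `ArgumentError` are assumptions for illustration (the gem raises `BibCard::InvalidURIException`), and the patterns use `\A`/`\z` anchors rather than the `^`/`$` anchors in the source.

```ruby
require "uri"

# Patterns adapted from viaf_uri? and lcnaf_uri? in the source.
VIAF_PATTERN  = %r{\Ahttp://viaf\.org/viaf/\d+\z}
LCNAF_PATTERN = %r{\Ahttp://id\.loc\.gov/authorities/names/n[bors]?\d+\z}
LCNAF_BASE    = "http://id.loc.gov/authorities/names/"

# Hypothetical helper: returns the URL that would be loaded to obtain the
# VIAF graph for a given input URI.
def viaf_lookup_url(uri)
  url = uri.to_s
  if LCNAF_PATTERN.match(url)
    # LCNAF identifiers are looked up as "LC|<identifier>", percent-encoded.
    identifier = url.sub(LCNAF_BASE, "")
    "http://viaf.org/viaf/sourceID/" + URI.encode_www_form_component("LC|#{identifier}")
  elsif VIAF_PATTERN.match(url)
    url
  else
    raise ArgumentError, "not a VIAF or LCNAF URI: #{url}"
  end
end

lcnaf_lookup = viaf_lookup_url("http://id.loc.gov/authorities/names/n78086005")
viaf_lookup  = viaf_lookup_url("http://viaf.org/viaf/15873")

puts lcnaf_lookup # => http://viaf.org/viaf/sourceID/LC%7Cn78086005
puts viaf_lookup  # => http://viaf.org/viaf/15873
```

Rejecting unrecognized URIs before any HTTP request is made keeps a malformed identifier from being reported as a missing entity.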
@@ -0,0 +1,5 @@
1
+ module BibCard
2
+ class Author
3
+
4
+ end
5
+ end
@@ -0,0 +1,9 @@
1
+ module BibCard
2
+ class CrawlException < RuntimeError
3
+
4
+ def initialize(message = "")
5
+ super message
6
+ end
7
+
8
+ end
9
+ end
@@ -0,0 +1,420 @@
1
+ module BibCard
2
+ class Crawler
3
+
4
+ def initialize(uri, repository)
5
+ @subject = RDF::URI(uri)
6
+ @repository = repository
7
+ end
8
+
9
+ SPARQL_ENDPOINTS = {
10
+ getty: "http://vocab.getty.edu/sparql?query=",
11
+ wikidata: "http://query.wikidata.org/sparql?query=",
12
+ dbpedia: "http://dbpedia.org/sparql?query="
13
+ }
14
+
15
+ def birth_date
16
+ stmt = @repository.query({subject: @subject, predicate: SCHEMA_BIRTHDATE}).first
17
+ stmt.nil? ? nil : stmt.object
18
+ end
19
+
20
+ def death_date
21
+ stmt = @repository.query({subject: @subject, predicate: SCHEMA_DEATHDATE}).first
22
+ stmt.nil? ? nil : stmt.object
23
+ end
24
+
25
+ def loc_uri
26
+ stmt = @repository.query({subject: @subject, predicate: SCHEMA_SAME_AS}).select {|s| s.object.to_s.match('http://id.loc.gov/authorities/names/')}.first
27
+ stmt.nil? ? nil : stmt.object
28
+ end
29
+
30
+ def dbpedia_uri
31
+ stmt = @repository.query({subject: @subject, predicate: SCHEMA_SAME_AS}).select {|s| s.object.to_s.match('http://dbpedia.org/resource')}.first
32
+ stmt.nil? ? nil : stmt.object
33
+ end
34
+
35
+ def getty_uri
36
+ stmt = @repository.query({subject: @subject, predicate: SCHEMA_SAME_AS}).select {|s| s.object.to_s.match('vocab.getty.edu')}.first
37
+ stmt.nil? ? nil : RDF::URI.new( stmt.object.to_s.gsub('-agent', '') )
38
+ end
39
+
40
+ def wikidata_uri
41
+ stmt = @repository.query({subject: @subject, predicate: SCHEMA_SAME_AS}).select {|s| s.object.to_s.match('http://www.wikidata.org/entity')}.first
42
+ stmt.nil? ? nil : stmt.object
43
+ end
44
+
45
+ def creator_graph
46
+ graph = RDF::Graph.new
47
+ if @repository.size > 0
48
+ @repository.query({subject: @subject, predicate: RDF.type}).each {|stmt| graph << stmt}
49
+ @repository.query({subject: @subject, predicate: SCHEMA_NAME}).each {|stmt| graph << stmt}
50
+ graph << [@subject, SCHEMA_BIRTHDATE, self.birth_date] if self.birth_date
51
+ graph << [@subject, SCHEMA_DEATHDATE, self.death_date] if self.death_date
52
+ graph << [@subject, SCHEMA_SAME_AS, self.loc_uri] if self.loc_uri
53
+ graph << [@subject, SCHEMA_SAME_AS, self.dbpedia_uri] if self.dbpedia_uri
54
+ graph << [@subject, SCHEMA_SAME_AS, self.getty_uri] if self.getty_uri
55
+ graph << [@subject, SCHEMA_SAME_AS, self.wikidata_uri] if self.wikidata_uri
56
+ graph << dbpedia_graph if self.dbpedia_uri
57
+ graph << getty_note_graph if self.getty_uri
58
+ graph << wikidata_graph if self.wikidata_uri
59
+ end
60
+ graph
61
+ end
62
+
63
+ def dbpedia_graph
64
+ graph = RDF::Graph.new
65
+ begin
66
+ graph << profile_graph
67
+ graph << influence_graph
68
+ graph << film_graph
69
+ rescue RestClient::RequestTimeout
70
+ BibCard.logger.warn "DBPedia failed to respond. SPARQL query request timed out after 5 seconds for #{@current_query}."
71
+ rescue StandardError => e
72
+ BibCard.logger.warn "DBPedia failed to respond. Processing data for SPARQL request: #{@current_query}. Error: #{e.message}"
73
+ end
74
+ graph
75
+ end
76
+
77
+ def influence_graph
78
+ graph = RDF::Graph.new
79
+ [:influences, :influenced].each do |relationship|
80
+ m = self.method(relationship)
81
+ m.call.each do |influence|
82
+ if relationship == :influences
83
+ field = "influence"
84
+ predicate = DBO_INFLUENCED_BY
85
+ else
86
+ field = "influenced"
87
+ predicate = DBO_INFLUENCED
88
+ end
89
+ influence_entity = RDF::URI.new(influence[field]["value"])
90
+ graph << [self.dbpedia_uri, predicate, influence_entity]
91
+ graph << [influence_entity, RDFS_LABEL, influence["#{field}Label"]["value"]]
92
+ if influence["#{field}GivenName"] and influence["#{field}Surname"]
93
+ graph << [influence_entity, FOAF_GIVEN_NAME, influence["#{field}GivenName"]["value"]]
94
+ graph << [influence_entity, FOAF_SURNAME, influence["#{field}Surname"]["value"]]
95
+ end
96
+ if influence["#{field}SameAs"]
97
+ graph << [influence_entity, RDF::OWL.sameAs, influence["#{field}SameAs"]["value"]]
98
+ end
99
+ end
100
+ end
101
+ graph
102
+ end
103
+
104
+ def film_graph
105
+ @current_query = "film graph"
106
+ graph = RDF::Graph.new
107
+ self.film_appearances.each do |appearance|
108
+ film = RDF::URI.new(appearance["film"]["value"])
109
+ graph << [film, DBO_STARRING, self.dbpedia_uri]
110
+ graph << [film, RDF::RDFS.label, appearance["filmName"]["value"]]
111
+ graph << [film, DBO_ABSTRACT, appearance["filmAbstract"]["value"]]
112
+ end
113
+ graph
114
+ end
115
+
116
+ def profile_graph
117
+ @current_query = "profile graph"
118
+ graph = RDF::Graph.new
119
+ dbpedia_subject = self.dbpedia_uri
120
+ profile = self.dbpedia_profile
121
+ if profile
122
+ graph << [dbpedia_subject, DBO_ABSTRACT, profile["abstract"]["value"]] if profile["abstract"]
123
+ graph << [dbpedia_subject, DBP_FOUNDED, profile["foundedDate"]["value"]] if profile["foundedDate"]
124
+ graph << [dbpedia_subject, DBP_LOCATION, profile["location"]["value"]] if profile["location"]
125
+ graph << [dbpedia_subject, DBO_THUMBNAIL, profile["thumbnail"]["value"]] if profile["thumbnail"]
126
+ graph << [dbpedia_subject, FOAF_DEPICTION, profile["depiction"]["value"]] if profile["depiction"]
127
+ end
128
+ graph
129
+ end
130
+
131
+ def influences
132
+ @current_query = "influences graph"
133
+ sparql = "
134
+ #{self.dbpedia_sparql_prefixes}
135
+
136
+ SELECT DISTINCT ?influence ?influenceGivenName ?influenceSurname ?influenceSameAs ?influenceLabel
137
+ WHERE {
138
+ {
139
+ ?influence dbo:influenced <#{self.dbpedia_uri}> .
140
+ }
141
+ UNION
142
+ {
143
+ <#{self.dbpedia_uri}> dbo:influencedBy ?influence .
144
+ }
145
+ ?influence rdfs:label ?influenceLabel .
146
+ OPTIONAL {
147
+ ?influence foaf:givenName ?influenceGivenName .
148
+ ?influence foaf:surname ?influenceSurname .
149
+ }
150
+ OPTIONAL {
151
+ ?influence owl:sameAs ?influenceSameAs .
152
+ FILTER regex(STR(?influenceSameAs), \"viaf.org\").
153
+ }
154
+ FILTER (lang(?influenceLabel) = 'en')
155
+ }
156
+ "
157
+ get_data(sparql, :dbpedia)
158
+ end
159
+
160
+ def influenced
161
+ @current_query = "influence upon graph"
162
+ sparql = "
163
+ #{self.dbpedia_sparql_prefixes}
164
+
165
+ SELECT ?influenced ?influencedGivenName ?influencedSurname ?influencedSameAs ?influencedLabel
166
+ WHERE {
167
+ {
168
+ <#{self.dbpedia_uri}> dbo:influenced ?influenced .
169
+ }
170
+ UNION
171
+ {
172
+ ?influenced dbo:influencedBy <#{self.dbpedia_uri}> .
173
+ }
174
+ ?influenced rdfs:label ?influencedLabel .
175
+ OPTIONAL {
176
+ ?influenced foaf:givenName ?influencedGivenName .
177
+ ?influenced foaf:surname ?influencedSurname .
178
+ }
179
+ OPTIONAL {
180
+ ?influenced owl:sameAs ?influencedSameAs .
181
+ FILTER regex(STR(?influencedSameAs), \"viaf.org\").
182
+ }
183
+ FILTER (lang(?influencedLabel) = 'en')
184
+ }
185
+ "
186
+ get_data(sparql, :dbpedia)
187
+ end
188
+
189
+ def dbpedia_profile
190
+ sparql = "
191
+ #{self.dbpedia_sparql_prefixes}
192
+
193
+ SELECT ?abstract ?foundedDate ?location ?thumbnail ?depiction
194
+ WHERE {
195
+ OPTIONAL { <#{self.dbpedia_uri}> dbo:abstract ?abstract . }
196
+ OPTIONAL {<#{self.dbpedia_uri}> dbp:location ?location . }
197
+ OPTIONAL { <#{self.dbpedia_uri}> dbp:foundedDate ?foundedDate . }
198
+ OPTIONAL { <#{self.dbpedia_uri}> dbo:thumbnail ?thumbnail . }
199
+ OPTIONAL { <#{self.dbpedia_uri}> foaf:depiction ?depiction . }
200
+ FILTER(langMatches(lang(?abstract), \"en\"))
201
+ }
202
+ "
203
+ get_data(sparql, :dbpedia).first
204
+ end
205
+
206
+ def film_appearances
207
+ sparql = "
208
+ #{self.dbpedia_sparql_prefixes}
209
+
210
+ SELECT ?film ?filmName ?filmAbstract
211
+ WHERE {
212
+ ?film dbo:starring <#{self.dbpedia_uri}> .
213
+ ?film rdfs:label ?filmName .
214
+ ?film dbo:abstract ?filmAbstract .
215
+ FILTER(langMatches(lang(?filmName), \"en\"))
216
+ FILTER(langMatches(lang(?filmAbstract), \"en\"))
217
+ }
218
+ "
219
+ get_data(sparql, :dbpedia)
220
+ end
221
+
222
+ def getty_note_graph
223
+ @current_query = "getty note graph"
224
+ graph = RDF::Graph.new
225
+ begin
226
+ getty_subject = self.getty_uri
227
+ self.getty_scope_notes.each do |scope_note|
228
+ # Add the scope note itself
229
+ scope_note_uri = RDF::URI.new(scope_note["scopeNote"]["value"])
230
+ graph << [getty_subject, SKOS_SCOPE_NOTE, scope_note_uri]
231
+ graph << [scope_note_uri, RDF.value, scope_note["scopeNoteValue"]["value"]]
232
+
233
+ # Add the sources/citations for the scope note
234
+ source_uri = RDF::URI.new(scope_note["source"]["value"])
235
+ graph << [scope_note_uri, DC_SOURCE, source_uri]
236
+ if scope_note["sourceShortTitle"]
237
+ graph << [source_uri, BIBO_SHORT_TITLE, scope_note["sourceShortTitle"]["value"]]
238
+ else
239
+ parent_uri = RDF::URI.new(scope_note["parent"]["value"])
240
+ graph << [source_uri, DC_IS_PART_OF, parent_uri]
241
+ graph << [source_uri, RDF.type, BIBO_DOCUMENT_PART]
242
+ graph << [parent_uri, BIBO_SHORT_TITLE, scope_note["parentShortTitle"]["value"]]
243
+ end
244
+ end
245
+ rescue RestClient::RequestTimeout
246
+ BibCard.logger.warn "Getty failed to respond. SPARQL query request timed out after 5 seconds for #{@current_query}."
247
+ rescue StandardError => e
248
+ BibCard.logger.warn "Getty failed to respond. Processing data for SPARQL request: #{@current_query}. Error: #{e.message}"
249
+ end
250
+ graph
251
+ end
252
+
253
+ def getty_scope_notes
254
+ sparql = "
255
+ PREFIX ulan: <http://vocab.getty.edu/ulan/>
256
+ PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
257
+ PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
258
+ PREFIX dct: <http://purl.org/dc/terms/>
259
+ PREFIX bibo: <http://purl.org/ontology/bibo/>
260
+
261
+ SELECT ?scopeNote ?scopeNoteValue ?source ?sourceShortTitle ?parent ?parentShortTitle
262
+ WHERE {
263
+ <#{self.getty_uri.to_s}> skos:scopeNote ?scopeNote .
264
+ ?scopeNote rdf:value ?scopeNoteValue .
265
+ ?scopeNote dct:source ?source .
266
+ OPTIONAL { ?source bibo:shortTitle ?sourceShortTitle . }
267
+ OPTIONAL {
268
+ ?source dct:isPartOf ?parent .
269
+ ?parent bibo:shortTitle ?parentShortTitle .
270
+ }
271
+ }
272
+ "
273
+ get_data(sparql, :getty)
274
+ end
275
+
276
+ def wikidata_graph
277
+ graph = RDF::Graph.new
278
+ begin
279
+ wikidata_subject = self.wikidata_uri
280
+ self.alma_maters.each do |alma_mater|
281
+ @current_query = "alma maters graph"
282
+ am_inst_uri = RDF::URI.new(alma_mater["inst"]["value"])
283
+ am_edu_stmt = RDF::URI.new(alma_mater["statement"]["value"])
284
+
285
+ graph << [wikidata_subject, WDT_EDUCATED_AT, am_inst_uri]
286
+ graph << [am_inst_uri, RDF::RDFS.label, alma_mater["instLabel"]["value"]]
287
+ graph << [wikidata_subject, WDP_EDUCATED_AT, am_edu_stmt]
288
+ graph << [am_edu_stmt, WDPS_STMT_EDU_AT, am_inst_uri]
289
+
290
+ # Not all assertions have references/citations
291
+ if alma_mater["reference"]
292
+ am_stmt_ref = RDF::URI.new(alma_mater["reference"]["value"])
293
+ am_ref_source = RDF::URI.new(alma_mater["source"]["value"])
294
+
295
+ graph << [am_edu_stmt, PROV_DERIVED_FROM, am_stmt_ref]
296
+ graph << [am_stmt_ref, WDR_STATED_IN, am_ref_source]
297
+ graph << [am_ref_source, RDF::RDFS.label, alma_mater["sourceLabel"]["value"]]
298
+ end
299
+ end
300
+
301
+ bio = self.brief_bio
302
+ if bio
303
+ @current_query = "brief bio graph"
304
+ graph << [wikidata_subject, SCHEMA_DESCRIPTION, bio["description"]["value"]] if bio["description"]
305
+ if bio["workLocation"]
306
+ work_loc_uri = RDF::URI.new(bio["workLocation"]["value"])
307
+ graph << [wikidata_subject, WDT_WORK_LOCATION, work_loc_uri]
308
+ graph << [work_loc_uri, RDF::RDFS.label, bio["workLocationLabel"]["value"]]
309
+ end
310
+ end
311
+
312
+ self.notable_works.each do |work|
313
+ @current_query = "notable works graph"
314
+ work_uri = RDF::URI.new(work["notableWork"]["value"])
315
+ graph << [wikidata_subject, WDT_NOTABLE_WORKS, work_uri]
316
+ graph << [work_uri, RDF::RDFS.label, work["notableWorkLabel"]["value"]]
317
+ graph << [work_uri, WDT_ISBN, work["isbn"]["value"]] if work["isbn"]
318
+ graph << [work_uri, WDT_OCLC_NUMBER, work["oclcNumber"]["value"]] if work["oclcNumber"]
319
+ end
320
+ rescue RestClient::RequestTimeout
321
+ BibCard.logger.warn "WikiData failed to respond. SPARQL query request timed out after 5 seconds for #{@current_query}."
322
+ rescue StandardError => e
323
+ BibCard.logger.warn "WikiData failed to respond. Processing data for SPARQL request: #{@current_query}. Error: #{e.message}"
324
+ end
325
+ graph
326
+ end
327
+
328
+ def alma_maters
329
+ sparql = "
330
+ #{self.wikidata_sparql_prefixes}
331
+
332
+ SELECT DISTINCT ?inst ?instLabel ?statement ?reference ?source ?sourceLabel
333
+ WHERE
334
+ {
335
+ <#{self.wikidata_uri.to_s}> p:P69 ?statement .
336
+ ?statement ps:P69 ?inst .
337
+ ?inst rdfs:label ?instLabel .
338
+ FILTER(langMatches(lang(?instLabel), \"en\"))
339
+ OPTIONAL {
340
+ ?statement prov:wasDerivedFrom ?reference .
341
+ ?reference pref:P248 ?source .
342
+ ?source rdfs:label ?sourceLabel .
343
+ FILTER(langMatches(lang(?sourceLabel), \"en\"))
344
+ }
345
+ }
346
+ "
347
+ get_data(sparql, :wikidata)
348
+ end
349
+
350
+ def brief_bio
351
+ sparql = "
352
+ #{self.wikidata_sparql_prefixes}
353
+
354
+ SELECT DISTINCT ?description ?workLocation ?workLocationLabel
355
+ WHERE
356
+ {
357
+ <#{self.wikidata_uri.to_s}> schema:description ?description .
358
+ OPTIONAL {
359
+ <#{self.wikidata_uri.to_s}> wdt:P937 ?workLocation .
360
+ }
361
+ SERVICE wikibase:label {
362
+ bd:serviceParam wikibase:language \"en\" .
363
+ }
364
+ FILTER(langMatches(lang(?description), \"en\"))
365
+ }
366
+ "
367
+ get_data(sparql, :wikidata).first
368
+ end
369
+
370
+ def notable_works
371
+ sparql = "
372
+ #{self.wikidata_sparql_prefixes}
373
+
374
+ SELECT DISTINCT ?notableWork ?notableWorkLabel ?isbn ?oclcNumber
375
+ WHERE
376
+ {
377
+ <#{self.wikidata_uri.to_s}> wdt:P800 ?notableWork .
378
+ OPTIONAL {
379
+ ?notableWork wdt:P212 ?isbn .
380
+ ?notableWork wdt:P243 ?oclcNumber .
381
+ }
382
+ SERVICE wikibase:label {
383
+ bd:serviceParam wikibase:language \"en\" .
384
+ }
385
+ }
386
+ "
387
+ notable_works = get_data(sparql, :wikidata)
388
+ notable_works.select {|work| work["notableWorkLabel"] != nil and !work["notableWorkLabel"]["value"].match(/^Q\d+$/)}
389
+ end
390
+
391
+ protected
392
+
393
+ def get_data(sparql, source)
394
+ url = SPARQL_ENDPOINTS[source] + URI::encode_www_form_component(sparql.gsub(/\n/, ' '))
395
+ data = RestClient::Request.execute(method: :get, url: url, headers: {accept: "application/sparql-results+json"}, timeout: 5)
396
+ parsed_data = JSON.parse data
397
+ parsed_data["results"]["bindings"]
398
+ end
399
+
400
+ def wikidata_sparql_prefixes
401
+ "
402
+ PREFIX wikibase: <http://wikiba.se/ontology#>
403
+ PREFIX p: <http://www.wikidata.org/prop/>
404
+ PREFIX pref: <http://www.wikidata.org/prop/reference/>
405
+ PREFIX ps: <http://www.wikidata.org/prop/statement/>
406
+ PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
407
+ "
408
+ end
409
+
410
+ def dbpedia_sparql_prefixes
411
+ "
412
+ PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
413
+ PREFIX owl: <http://www.w3.org/2002/07/owl#>
414
+ PREFIX foaf: <http://xmlns.com/foaf/0.1/>
415
+ PREFIX dbo: <http://dbpedia.org/ontology/>
416
+ "
417
+ end
418
+
419
+ end
420
+ end
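Every crawler query goes through `get_data`, which URL-encodes the SPARQL text, requests `application/sparql-results+json`, and returns the `results.bindings` array. The sketch below walks through those steps offline with the standard library. The response body, the `example.org` entity URIs, and the labels are made-up sample data, not real endpoint output; only the endpoint string, the encoding call, and the Q-id filter come from the source.

```ruby
require "json"
require "uri"

# Endpoint prefix taken from SPARQL_ENDPOINTS in the source.
ENDPOINT = "http://query.wikidata.org/sparql?query="

sparql = <<~SPARQL
  SELECT DISTINCT ?notableWork ?notableWorkLabel
  WHERE { ?person wdt:P800 ?notableWork . }
SPARQL

# Newlines are flattened to spaces, then the whole query is form-encoded.
url = ENDPOINT + URI.encode_www_form_component(sparql.gsub(/\n/, " "))

# Hypothetical response in the SPARQL 1.1 Query Results JSON shape.
body = <<~BODY
  {
    "head": { "vars": ["notableWork", "notableWorkLabel"] },
    "results": { "bindings": [
      { "notableWork":      { "type": "uri", "value": "http://example.org/entity/Q100" },
        "notableWorkLabel": { "type": "literal", "xml:lang": "en", "value": "Example Work" } },
      { "notableWork":      { "type": "uri", "value": "http://example.org/entity/Q200" },
        "notableWorkLabel": { "type": "literal", "value": "Q200" } }
    ] }
  }
BODY

bindings = JSON.parse(body)["results"]["bindings"]

# Same filter as notable_works: drop rows whose label is only a bare Q-id,
# which is what the label service returns when no English label exists.
labelled = bindings.select do |row|
  row["notableWorkLabel"] && !row["notableWorkLabel"]["value"].match(/^Q\d+$/)
end

labels = labelled.map { |row| row["notableWorkLabel"]["value"] }
puts labels.inspect # => ["Example Work"]
```

Each binding is a hash keyed by variable name whose value is itself a hash, which is why the crawler reads fields as `row["name"]["value"]` and checks for the key before using optional variables.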