bib_card 0.5.0
- checksums.yaml +7 -0
- data/.gitignore +10 -0
- data/.rspec +3 -0
- data/.travis.yml +4 -0
- data/CODE_OF_CONDUCT.md +13 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +21 -0
- data/README.md +88 -0
- data/Rakefile +6 -0
- data/bib_card.gemspec +42 -0
- data/bin/console +14 -0
- data/bin/setup +7 -0
- data/lib/bib_card.rb +112 -0
- data/lib/bib_card/author.rb +5 -0
- data/lib/bib_card/crawl_exception.rb +9 -0
- data/lib/bib_card/crawler.rb +420 -0
- data/lib/bib_card/db_pedia/resource.rb +34 -0
- data/lib/bib_card/entity_not_found_exception.rb +11 -0
- data/lib/bib_card/getty/scope_note.rb +18 -0
- data/lib/bib_card/getty/source.rb +15 -0
- data/lib/bib_card/getty/subject.rb +11 -0
- data/lib/bib_card/invalid_uri_exception.rb +11 -0
- data/lib/bib_card/person.rb +65 -0
- data/lib/bib_card/railtie.rb +8 -0
- data/lib/bib_card/uris.rb +76 -0
- data/lib/bib_card/version.rb +3 -0
- data/lib/bib_card/wikidata/entity.rb +28 -0
- metadata +242 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA256:
  metadata.gz: 4ae3e321e881cca41e3c901fdaa406fdad13bb4942e1e5fc5aba0ed9978aa510
  data.tar.gz: 93fe803b13025531d593fb20ad0e8df175c31f9286fa87505696e48ca49d33e4
SHA512:
  metadata.gz: 228e39fc5badb23ad83bffbcb0ddec432f245b7ba92f0fea584cc29e1edb1c6b677a399597acda26f99d70a99576d081634df0b09a34972d96cd13a298f2adde
  data.tar.gz: 38651eac526d014b72f914cbf457469125a5229a1a6ad75e3d1caf7bc8bae91721fbfc5de9bff27e83de7d7bf74dcacf4a53b3357a09d9f8d6e219dbc6ee9bd9
data/.gitignore
ADDED
data/.rspec
ADDED
data/.travis.yml
ADDED
data/CODE_OF_CONDUCT.md
ADDED
@@ -0,0 +1,13 @@
# Contributor Code of Conduct

As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.

We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.

Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team.

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers.

This Code of Conduct is adapted from the [Contributor Covenant](http://contributor-covenant.org), version 1.0.0, available at [http://contributor-covenant.org/version/1/0/0/](http://contributor-covenant.org/version/1/0/0/)
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2016 University of Wisconsin Board of Regents

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,88 @@
# BibCard

BibCard is a Ruby library for retrieving and assembling knowledge card information about the authors found in bibliographic data. It takes identifiers like Library of Congress Name Authority File (LCNAF) or VIAF URIs as input and crawls Linked Open Data sources on the web to assemble Ruby objects or RDF serializations. This library will fetch data from:

* [Virtual International Authority File (VIAF)](http://viaf.org/)
* [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page)
* [DBpedia](http://wiki.dbpedia.org/)
* [Getty Vocabularies LOD](http://vocab.getty.edu/)

The VIAF URI lies at the core of the `BibCard::Person` object because it acts as a hub to many other data sources on the Web. With the VIAF data in hand, the other three sources listed above are "crawled" for more information about a given identity. Technically, the data is requested by making one or more HTTP requests to each of the data sources' public SPARQL endpoints.

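As a rough illustration of the SPARQL requests described above, the sketch below builds the URL for a query sent over HTTP GET. The endpoint string matches the DBpedia entry in the crawler's `SPARQL_ENDPOINTS` table; the helper name `sparql_request_url` and the sample query are assumptions made for illustration, not part of BibCard's public API.

```ruby
require "uri"

# Endpoint prefix as it appears in BibCard::Crawler::SPARQL_ENDPOINTS.
DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql?query="

# Hypothetical helper: collapse the query onto one line and
# percent-encode it so it can be appended to the endpoint prefix.
def sparql_request_url(endpoint, sparql)
  endpoint + URI.encode_www_form_component(sparql.gsub(/\n/, " "))
end

# Sample query (an assumption, written with full IRIs so no PREFIX is needed).
query = <<~SPARQL
  SELECT ?abstract
  WHERE { <http://dbpedia.org/resource/Pablo_Picasso> <http://dbpedia.org/ontology/abstract> ?abstract . }
SPARQL

puts sparql_request_url(DBPEDIA_ENDPOINT, query)
```

The crawler's `get_data` method sends a URL built this way with an `Accept: application/sparql-results+json` header and a five-second timeout, then reads the `results` / `bindings` array from the parsed JSON.
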
`BibCard` makes extensive use of the [Spira](https://github.com/ruby-rdf/spira) library for RDF-to-object mapping. The result is that after assembling a micrograph of knowledge card data the client can work with simple code objects.

## Installation

This gem is not yet in rubygems. Until the tires are kicked a few more times, please use this command-line install.

```bash
$ git clone https://github.com/UW-Madison-Library/bibcard
$ cd bibcard
$ bundle install
$ gem build bib_card.gemspec
$ gem install bib_card-<VERSION-NUMBER>.gem
```

## Usage

### Instantiate a `BibCard::Person`

Given a Library of Congress Name Authority File or VIAF URI, instantiate a `BibCard::Person` and inspect the data.

*Note:* Every call to `BibCard.person()` will make many calls to the public SPARQL endpoints for the sources cited above.

```ruby
require 'bib_card'

lcnaf_uri = "http://id.loc.gov/authorities/names/n78086005"
person = BibCard.person(lcnaf_uri)

person.name(["en", "en-US"]) # => "Pablo Picasso"
person.birth_date # => "1881-10-25"
person.death_date # => "1973-04-09"

person.dbpedia_resource # => <BibCard::DBPedia::Resource:70307318111440 @subject: http://dbpedia.org/resource/Pablo_Picasso>
person.dbpedia_resource.abstract # => "Pablo Ruiz y Picasso, also known as Pablo Picasso (/pɪˈkɑːsoʊ, -ˈkæsoʊ/; Spanish: [ˈpaβlo piˈkaso]; 25 October 1881 – 8 April 1973), was a Spanish painter..."

person.getty_subject # => <BibCard::Getty::Subject:70307331508400 @subject: http://vocab.getty.edu/ulan/500009666>
person.getty_subject.scope_note # => <BibCard::Getty::ScopeNote:70307331409520 @subject: http://vocab.getty.edu/ulan/scopeNote/53649>
person.getty_subject.scope_note.value # => "Long-lived and very influential Spanish artist, active in France. He dominated 20th-century European art. With Georges Braque, he is credited with inventing Cubism."
person.getty_subject.scope_note.sources # => [<BibCard::Getty::Source:70307327167300 @subject: http://vocab.getty.edu/ulan/source/2100153925>, <BibCard::Getty::Source:70307327106100 @subject: http://vocab.getty.edu/ulan/source/2100156698>]
person.getty_subject.scope_note.sources.map {|source| source.short_title} # => ["LCNAF Library of Congress Name Authority File [n.d.]", "Grove Dictionary of Art online (1999-2002)"]
```

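Because each `BibCard.person()` call triggers network requests, it can be worth checking the shape of the input URI first. The patterns below are adapted from the `viaf_uri?` and `lcnaf_uri?` checks in `lib/bib_card.rb`. The `supported_author_uri?` helper is a hypothetical name, and the `\A`/`\z` anchors are a slightly stricter substitute for the `^`/`$` anchors the library itself uses.

```ruby
# Patterns adapted from BibCard.viaf_uri? and BibCard.lcnaf_uri?.
VIAF_URI_PATTERN  = %r{\Ahttp://viaf\.org/viaf/\d+\z}
LCNAF_URI_PATTERN = %r{\Ahttp://id\.loc\.gov/authorities/names/n[bors]?\d+\z}

# Hypothetical helper: true when the URI looks like one BibCard accepts.
def supported_author_uri?(uri)
  candidate = uri.to_s
  candidate.match?(VIAF_URI_PATTERN) || candidate.match?(LCNAF_URI_PATTERN)
end

puts supported_author_uri?("http://id.loc.gov/authorities/names/n78086005") # true
puts supported_author_uri?("http://viaf.org/viaf/15873")                    # true
puts supported_author_uri?("https://example.org/not-an-authority")          # false
```

When the gem is loaded, the equivalent checks appear to be available directly as `BibCard.viaf_uri?(uri)` and `BibCard.lcnaf_uri?(uri)`; any other URI makes the library raise `BibCard::InvalidURIException`.
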
### Fetch Raw Data for a `BibCard::Person`

A BibCard knowledge/info card is generated from many different sources, which is inherently slow. You can also retrieve person data as a serialized string of RDF N-Triples. The raw data is available so that it can be cached locally. Once the data is cached you can load a [Spira](https://github.com/ruby-rdf/spira) repository and instantiate a `BibCard::Person` object.

```ruby
require 'bib_card'

lcnaf_uri = "http://id.loc.gov/authorities/names/n78086005"
data = BibCard.person_data(lcnaf_uri)
puts data

# <http://viaf.org/viaf/15873> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
# <http://viaf.org/viaf/15873> <http://schema.org/deathDate> "1973-04-09" .
# <http://viaf.org/viaf/15873> <http://schema.org/sameAs> <http://id.loc.gov/authorities/names/n78086005> .
# ...

#### cache the serialized data ####

Spira.repository = RDF::Repository.new.from_ntriples(data)
viaf_uri = Spira.repository.query(predicate: BibCard::SCHEMA_SAME_AS, object: RDF::URI.new(lcnaf_uri)).first.subject
person = viaf_uri.as(BibCard::Person)

person # => <BibCard::Person:70307327106900 @subject: http://viaf.org/viaf/15873>
person.name(["en", "en-US"]) # => "Pablo Picasso"
```

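The "cache the serialized data" step is left open in the example above. One possible approach, sketched here under the assumption that a simple file-per-URI cache is acceptable, is to store the N-Triples string on disk keyed by a hash of the input URI. The helper name `cached_person_data` and the stand-in fetcher are hypothetical and not part of BibCard.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical helper, not part of BibCard: return cached N-Triples for a URI,
# yielding to the block (for example, BibCard.person_data) only on a cache miss.
def cached_person_data(uri, cache_dir)
  FileUtils.mkdir_p(cache_dir)
  path = File.join(cache_dir, "#{Digest::SHA256.hexdigest(uri.to_s)}.nt")
  return File.read(path) if File.exist?(path)

  data = yield(uri)
  File.write(path, data)
  data
end

# Stand-in fetcher so the sketch runs without network access.
fetch_count = 0
fetcher = lambda do |uri|
  fetch_count += 1
  %(<http://viaf.org/viaf/15873> <http://schema.org/sameAs> <#{uri}> .\n)
end

Dir.mktmpdir do |dir|
  lcnaf_uri = "http://id.loc.gov/authorities/names/n78086005"
  first  = cached_person_data(lcnaf_uri, dir, &fetcher)
  second = cached_person_data(lcnaf_uri, dir, &fetcher)
  puts first == second # both calls return the same data
  puts fetch_count     # the fetcher ran only once
end
```

With the real library the block would be something like `{ |uri| BibCard.person_data(uri) }`, and the returned string can be loaded with `RDF::Repository.new.from_ntriples(data)` as in the README example.
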
## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/UW-Madison-Library/bibcard. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.


## License

The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
data/Rakefile
ADDED
data/bib_card.gemspec
ADDED
@@ -0,0 +1,42 @@
# coding: utf-8
lib = File.expand_path('../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require 'bib_card/version'

Gem::Specification.new do |spec|
  spec.name = "bib_card"
  spec.version = BibCard::VERSION
  spec.authors = ["Steve Meyer"]
  spec.email = ["stephen.meyer@wisc.edu"]

  spec.summary = %q{Library Linked Data for building knowledge cards.}
  spec.description = %q{Given a URI for a bibliographic author entity, assemble useful information for producing a knowledge card.}
  spec.homepage = "https://github.com/UW-Madison-Library/bibcard.git"
  spec.license = "MIT"

  # Prevent pushing this gem to RubyGems.org by setting 'allowed_push_host', or
  # delete this section to allow pushing this gem to any host.
  # if spec.respond_to?(:metadata)
  # spec.metadata['allowed_push_host'] = "TODO: Set to 'http://mygemserver.com'"
  # else
  # raise "RubyGems 2.0 or newer is required to protect against public gem pushes."
  # end

  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
  spec.bindir = "exe"
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

  spec.add_runtime_dependency "rdf", "~> 3.0", ">= 3.0.1"
  spec.add_runtime_dependency "rdf-rdfxml", "~> 3.1.0"
  spec.add_runtime_dependency "spira", "~> 3.0"
  spec.add_runtime_dependency "rest-client", '~> 2.0.2'
  spec.add_runtime_dependency "nokogiri", "~> 1.11.1"
  spec.add_runtime_dependency "equivalent-xml", "~> 0.6"

  spec.add_development_dependency "bundler", "~> 2.2.5"
  spec.add_development_dependency "rake", "~> 13.0.1"
  spec.add_development_dependency "rspec", "~> 3.4"
  spec.add_development_dependency "simplecov", "~> 0.11", ">= 0.11.2"
  spec.add_development_dependency "webmock", "~> 2.0", ">= 2.0.3"
end
data/bin/console
ADDED
@@ -0,0 +1,14 @@
#!/usr/bin/env ruby

require "bundler/setup"
require "bib_card"

# You can add fixtures and/or initialization code here to make experimenting
# with your gem easier. You can also use a different console, if you like.

# (If you use this, don't forget to add pry to your Gemfile!)
# require "pry"
# Pry.start

require "irb"
IRB.start
data/bin/setup
ADDED
data/lib/bib_card.rb
ADDED
@@ -0,0 +1,112 @@
require "openssl"
require "rdf"
require "rdf/rdfxml"
require "rdf/xsd"
require "spira"
require "rest-client"
require "json"
require "logger" # Logger is used by BibCard.logger below
require "uri"    # URI.encode_www_form_component is used below

require "bib_card/version"
require "bib_card/uris"
require "bib_card/author"
require "bib_card/person"
require "bib_card/crawler"
require "bib_card/invalid_uri_exception"
require "bib_card/entity_not_found_exception"
require "bib_card/crawl_exception"
require "bib_card/db_pedia/resource"
require "bib_card/getty/scope_note"
require "bib_card/getty/source"
require "bib_card/getty/subject"
require "bib_card/wikidata/entity"

module BibCard

  class << self
    attr_writer :logger

    def logger
      @logger ||= Logger.new($stdout).tap do |logger|
        logger.progname = self.name
        logger.formatter = proc do |severity, time, progname, msg|
          "#{severity} [#{time.strftime('%Y-%m-%d %H:%M:%S.%L')}] #{progname}: #{msg}\n"
        end
      end
    end

    # Return the assembled person graph as a string of N-Triples.
    def person_data(uri)
      graph, _viaf_uri = creator_graph_and_viaf_uri(uri)
      graph.dump(:ntriples)
    end

    # Return a BibCard::Person backed by the assembled graph.
    def person(uri)
      graph, viaf_uri = creator_graph_and_viaf_uri(uri)
      Spira.repository = graph
      viaf_uri.as(Person)
    end

    def viaf_uri?(uri)
      url = uri.to_s
      url.match?(/^http:\/\/viaf\.org\/viaf\/\d+$/)
    end

    def lcnaf_uri?(uri)
      url = uri.to_s
      url.match?(/^http:\/\/id\.loc\.gov\/authorities\/names\/n[bors]{0,1}\d+$/)
    end

    private

    def creator_graph_and_viaf_uri(uri)
      # Convert the URI to an RDF::URI object if it is not already
      uri = convert_uri(uri)

      # 1. Get the VIAF data and determine the VIAF URI
      begin
        if lcnaf_uri?(uri)
          # Load the VIAF data graph and determine the VIAF URI based on the LCNAF URI.
          identifier = lcnaf_uri_to_identifier(uri)
          viaf_url = "http://viaf.org/viaf/sourceID/" + URI.encode_www_form_component("LC|#{identifier}")
          viaf_graph = RDF::Graph.load(viaf_url, format: :rdfxml)
          viaf_uri = viaf_graph.query({predicate: SCHEMA_SAME_AS, object: uri}).first.subject
        elsif viaf_uri?(uri)
          # Load the VIAF data graph using the URI
          viaf_uri = uri
          viaf_graph = RDF::Graph.load(uri, format: :rdfxml)
        else
          raise BibCard::InvalidURIException
        end
      rescue IOError
        raise BibCard::EntityNotFoundException
      rescue Errno::ECONNRESET
        raise BibCard::CrawlException.new("Unable to access VIAF, connection reset by peer.")
      rescue NoMethodError => e
        undifferentiated_uri_msg = "This VIAF URI has been corrupted by an 'undifferentiate name' and should be treated as unusable."
        results = viaf_graph.query({predicate: RDFS_COMMENT, object: undifferentiated_uri_msg})
        if results.size > 0
          raise BibCard::EntityNotFoundException.new(undifferentiated_uri_msg)
        else
          raise e
        end
      end

      # 2. Use the VIAF graph as the basis for crawling the other data sources
      crawler = Crawler.new(viaf_uri, viaf_graph)
      graph = crawler.creator_graph
      [graph, viaf_uri]
    end

    def lcnaf_uri_to_identifier(uri)
      url = uri.to_s
      url.gsub("http://id.loc.gov/authorities/names/", "")
    end

    # Wrap the given value in an RDF::URI unless it already is one
    def convert_uri(uri)
      uri.is_a?(RDF::URI) ? uri : RDF::URI.new(uri)
    end
  end
end

# Rails support
require 'bib_card/railtie' if defined?(Rails)
data/lib/bib_card/crawler.rb
ADDED
@@ -0,0 +1,420 @@
module BibCard
  class Crawler

    def initialize(uri, repository)
      @subject = RDF::URI(uri)
      @repository = repository
    end

    SPARQL_ENDPOINTS = {
      getty: "http://vocab.getty.edu/sparql?query=",
      wikidata: "http://query.wikidata.org/sparql?query=",
      dbpedia: "http://dbpedia.org/sparql?query="
    }

    def birth_date
      stmt = @repository.query({subject: @subject, predicate: SCHEMA_BIRTHDATE}).first
      stmt.nil? ? nil : stmt.object
    end

    def death_date
      stmt = @repository.query({subject: @subject, predicate: SCHEMA_DEATHDATE}).first
      stmt.nil? ? nil : stmt.object
    end

    def loc_uri
      stmt = @repository.query({subject: @subject, predicate: SCHEMA_SAME_AS}).select {|s| s.object.to_s.match('http://id.loc.gov/authorities/names/')}.first
      stmt.nil? ? nil : stmt.object
    end

    def dbpedia_uri
      stmt = @repository.query({subject: @subject, predicate: SCHEMA_SAME_AS}).select {|s| s.object.to_s.match('http://dbpedia.org/resource')}.first
      stmt.nil? ? nil : stmt.object
    end

    def getty_uri
      stmt = @repository.query({subject: @subject, predicate: SCHEMA_SAME_AS}).select {|s| s.object.to_s.match('vocab.getty.edu')}.first
      stmt.nil? ? nil : RDF::URI.new( stmt.object.to_s.gsub('-agent', '') )
    end

    def wikidata_uri
      stmt = @repository.query({subject: @subject, predicate: SCHEMA_SAME_AS}).select {|s| s.object.to_s.match('http://www.wikidata.org/entity')}.first
      stmt.nil? ? nil : stmt.object
    end

    # Assemble the person graph: core VIAF statements first, then the
    # data crawled from DBpedia, Getty and Wikidata when a link exists.
    def creator_graph
      graph = RDF::Graph.new
      if @repository.size > 0
        @repository.query({subject: @subject, predicate: RDF.type}).each {|stmt| graph << stmt}
        @repository.query({subject: @subject, predicate: SCHEMA_NAME}).each {|stmt| graph << stmt}
        graph << [@subject, SCHEMA_BIRTHDATE, self.birth_date] if self.birth_date
        graph << [@subject, SCHEMA_DEATHDATE, self.death_date] if self.death_date
        graph << [@subject, SCHEMA_SAME_AS, self.loc_uri] if self.loc_uri
        graph << [@subject, SCHEMA_SAME_AS, self.dbpedia_uri] if self.dbpedia_uri
        graph << [@subject, SCHEMA_SAME_AS, self.getty_uri] if self.getty_uri
        graph << [@subject, SCHEMA_SAME_AS, self.wikidata_uri] if self.wikidata_uri
        graph << dbpedia_graph if self.dbpedia_uri
        graph << getty_note_graph if self.getty_uri
        graph << wikidata_graph if self.wikidata_uri
      end
      graph
    end

    def dbpedia_graph
      graph = RDF::Graph.new
      begin
        graph << profile_graph
        graph << influence_graph
        graph << film_graph
      rescue RestClient::RequestTimeout
        BibCard.logger.warn "DBPedia failed to respond. SPARQL query request timed out after 5 seconds for #{@current_query}."
      rescue StandardError => e
        # StandardError rather than Exception, so interrupts and exits are not swallowed
        BibCard.logger.warn "DBPedia failed to respond. Processing data for SPARQL request: #{@current_query}. Error: #{e.message}"
      end
      graph
    end

    def influence_graph
      graph = RDF::Graph.new
      [:influences, :influenced].each do |relationship|
        m = self.method(relationship)
        m.call.each do |influence|
          if relationship == :influences
            field = "influence"
            predicate = DBO_INFLUENCED_BY
          else
            field = "influenced"
            predicate = DBO_INFLUENCED
          end
          influence_entity = RDF::URI.new(influence[field]["value"])
          graph << [self.dbpedia_uri, predicate, influence_entity]
          graph << [influence_entity, RDFS_LABEL, influence["#{field}Label"]["value"]]
          if influence["#{field}GivenName"] and influence["#{field}Surname"]
            graph << [influence_entity, FOAF_GIVEN_NAME, influence["#{field}GivenName"]["value"]]
            graph << [influence_entity, FOAF_SURNAME, influence["#{field}Surname"]["value"]]
          end
          # Check the sameAs binding for the current relationship, not only "influenceSameAs"
          if influence["#{field}SameAs"]
            graph << [influence_entity, RDF::OWL.sameAs, influence["#{field}SameAs"]["value"]]
          end
        end
      end
      graph
    end

    def film_graph
      @current_query = "film graph"
      graph = RDF::Graph.new
      self.film_appearances.each do |appearance|
        film = RDF::URI.new(appearance["film"]["value"])
        graph << [film, DBO_STARRING, self.dbpedia_uri]
        graph << [film, RDF::RDFS.label, appearance["filmName"]["value"]]
        graph << [film, DBO_ABSTRACT, appearance["filmAbstract"]["value"]]
      end
      graph
    end

    def profile_graph
      @current_query = "profile graph"
      graph = RDF::Graph.new
      dbpedia_subject = self.dbpedia_uri
      profile = self.dbpedia_profile
      if profile
        graph << [dbpedia_subject, DBO_ABSTRACT, profile["abstract"]["value"]] if profile["abstract"]
        graph << [dbpedia_subject, DBP_FOUNDED, profile["foundedDate"]["value"]] if profile["foundedDate"]
        graph << [dbpedia_subject, DBP_LOCATION, profile["location"]["value"]] if profile["location"]
        graph << [dbpedia_subject, DBO_THUMBNAIL, profile["thumbnail"]["value"]] if profile["thumbnail"]
        graph << [dbpedia_subject, FOAF_DEPICTION, profile["depiction"]["value"]] if profile["depiction"]
      end
      graph
    end

    def influences
      @current_query = "influences graph"
      sparql = "
        #{self.dbpedia_sparql_prefixes}

        SELECT DISTINCT ?influence ?influenceGivenName ?influenceSurname ?influenceSameAs ?influenceLabel
        WHERE {
          {
            ?influence dbo:influenced <#{self.dbpedia_uri}> .
          }
          UNION
          {
            <#{self.dbpedia_uri}> dbo:influencedBy ?influence .
          }
          ?influence rdfs:label ?influenceLabel .
          OPTIONAL {
            ?influence foaf:givenName ?influenceGivenName .
            ?influence foaf:surname ?influenceSurname .
          }
          OPTIONAL {
            ?influence owl:sameAs ?influenceSameAs .
            FILTER regex(STR(?influenceSameAs), \"viaf.org\").
          }
          FILTER (lang(?influenceLabel) = 'en')
        }
      "
      get_data(sparql, :dbpedia)
    end

    def influenced
      @current_query = "influence upon graph"
      sparql = "
        #{self.dbpedia_sparql_prefixes}

        SELECT ?influenced ?influencedGivenName ?influencedSurname ?influencedSameAs ?influencedLabel
        WHERE {
          {
            <#{self.dbpedia_uri}> dbo:influenced ?influenced .
          }
          UNION
          {
            ?influenced dbo:influencedBy <#{self.dbpedia_uri}> .
          }
          ?influenced rdfs:label ?influencedLabel .
          OPTIONAL {
            ?influenced foaf:givenName ?influencedGivenName .
            ?influenced foaf:surname ?influencedSurname .
          }
          OPTIONAL {
            ?influenced owl:sameAs ?influencedSameAs .
            FILTER regex(STR(?influencedSameAs), \"viaf.org\").
          }
          FILTER (lang(?influencedLabel) = 'en')
        }
      "
      get_data(sparql, :dbpedia)
    end

    def dbpedia_profile
      sparql = "
        #{self.dbpedia_sparql_prefixes}

        SELECT ?abstract ?foundedDate ?location ?thumbnail ?depiction
        WHERE {
          OPTIONAL { <#{self.dbpedia_uri}> dbo:abstract ?abstract . }
          OPTIONAL { <#{self.dbpedia_uri}> dbp:location ?location . }
          OPTIONAL { <#{self.dbpedia_uri}> dbp:foundedDate ?foundedDate . }
          OPTIONAL { <#{self.dbpedia_uri}> dbo:thumbnail ?thumbnail . }
          OPTIONAL { <#{self.dbpedia_uri}> foaf:depiction ?depiction . }
          FILTER(langMatches(lang(?abstract), \"en\"))
        }
      "
      get_data(sparql, :dbpedia).first
    end

    def film_appearances
      sparql = "
        #{self.dbpedia_sparql_prefixes}

        SELECT ?film ?filmName ?filmAbstract
        WHERE {
          ?film dbo:starring <#{self.dbpedia_uri}> .
          ?film rdfs:label ?filmName .
          ?film dbo:abstract ?filmAbstract .
          FILTER(langMatches(lang(?filmName), \"en\"))
          FILTER(langMatches(lang(?filmAbstract), \"en\"))
        }
      "
      get_data(sparql, :dbpedia)
    end

    def getty_note_graph
      @current_query = "getty note graph"
      graph = RDF::Graph.new
      begin
        getty_subject = self.getty_uri
        self.getty_scope_notes.each do |scope_note|
          # Add the scope note itself
          scope_note_uri = RDF::URI.new(scope_note["scopeNote"]["value"])
          graph << [getty_subject, SKOS_SCOPE_NOTE, scope_note_uri]
          graph << [scope_note_uri, RDF.value, scope_note["scopeNoteValue"]["value"]]

          # Add the sources/citations for the scope note
          source_uri = RDF::URI.new(scope_note["source"]["value"])
          graph << [scope_note_uri, DC_SOURCE, source_uri]
          if scope_note["sourceShortTitle"]
            graph << [source_uri, BIBO_SHORT_TITLE, scope_note["sourceShortTitle"]["value"]]
          else
            parent_uri = RDF::URI.new(scope_note["parent"]["value"])
            graph << [source_uri, DC_IS_PART_OF, parent_uri]
            graph << [source_uri, RDF.type, BIBO_DOCUMENT_PART]
            graph << [parent_uri, BIBO_SHORT_TITLE, scope_note["parentShortTitle"]["value"]]
          end
        end
      rescue RestClient::RequestTimeout
        BibCard.logger.warn "Getty failed to respond. SPARQL query request timed out after 5 seconds for #{@current_query}."
      rescue StandardError => e
        # StandardError rather than Exception, so interrupts and exits are not swallowed
        BibCard.logger.warn "Getty failed to respond. Processing data for SPARQL request: #{@current_query}. Error: #{e.message}"
      end
      graph
    end

    def getty_scope_notes
      sparql = "
        PREFIX ulan: <http://vocab.getty.edu/ulan/>
        PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX dct: <http://purl.org/dc/terms/>
        PREFIX bibo: <http://purl.org/ontology/bibo/>

        SELECT ?scopeNote ?scopeNoteValue ?source ?sourceShortTitle ?parent ?parentShortTitle
        WHERE {
          <#{self.getty_uri.to_s}> skos:scopeNote ?scopeNote .
          ?scopeNote rdf:value ?scopeNoteValue .
          ?scopeNote dct:source ?source .
          OPTIONAL { ?source bibo:shortTitle ?sourceShortTitle . }
          OPTIONAL {
            ?source dct:isPartOf ?parent .
            ?parent bibo:shortTitle ?parentShortTitle .
          }
        }
      "
      get_data(sparql, :getty)
    end

    def wikidata_graph
      graph = RDF::Graph.new
      begin
        wikidata_subject = self.wikidata_uri
        self.alma_maters.each do |alma_mater|
          @current_query = "alma maters graph"
          am_inst_uri = RDF::URI.new(alma_mater["inst"]["value"])
          am_edu_stmt = RDF::URI.new(alma_mater["statement"]["value"])

          graph << [wikidata_subject, WDT_EDUCATED_AT, am_inst_uri]
          graph << [am_inst_uri, RDF::RDFS.label, alma_mater["instLabel"]["value"]]
          graph << [wikidata_subject, WDP_EDUCATED_AT, am_edu_stmt]
          graph << [am_edu_stmt, WDPS_STMT_EDU_AT, am_inst_uri]

          # Not all assertions have references/citations
          if alma_mater["reference"]
            am_stmt_ref = RDF::URI.new(alma_mater["reference"]["value"])
            am_ref_source = RDF::URI.new(alma_mater["source"]["value"])

            graph << [am_edu_stmt, PROV_DERIVED_FROM, am_stmt_ref]
            graph << [am_stmt_ref, WDR_STATED_IN, am_ref_source]
            graph << [am_ref_source, RDF::RDFS.label, alma_mater["sourceLabel"]["value"]]
          end
        end

        bio = self.brief_bio
        if bio
          @current_query = "brief bio graph"
          graph << [wikidata_subject, SCHEMA_DESCRIPTION, bio["description"]["value"]] if bio["description"]
          if bio["workLocation"]
            work_loc_uri = RDF::URI.new(bio["workLocation"]["value"])
            graph << [wikidata_subject, WDT_WORK_LOCATION, work_loc_uri]
            graph << [work_loc_uri, RDF::RDFS.label, bio["workLocationLabel"]["value"]]
          end
        end

        self.notable_works.each do |work|
          @current_query = "notable works graph"
          work_uri = RDF::URI.new(work["notableWork"]["value"])
          graph << [wikidata_subject, WDT_NOTABLE_WORKS, work_uri]
          graph << [work_uri, RDF::RDFS.label, work["notableWorkLabel"]["value"]]
          graph << [work_uri, WDT_ISBN, work["isbn"]["value"]] if work["isbn"]
          graph << [work_uri, WDT_OCLC_NUMBER, work["oclcNumber"]["value"]] if work["oclcNumber"]
        end
      rescue RestClient::RequestTimeout
        BibCard.logger.warn "WikiData failed to respond. SPARQL query request timed out after 5 seconds for #{@current_query}."
      rescue StandardError => e
        # StandardError rather than Exception, so interrupts and exits are not swallowed
        BibCard.logger.warn "WikiData failed to respond. Processing data for SPARQL request: #{@current_query}. Error: #{e.message}"
      end
      graph
    end

    def alma_maters
      sparql = "
        #{self.wikidata_sparql_prefixes}

        SELECT DISTINCT ?inst ?instLabel ?statement ?reference ?source ?sourceLabel
        WHERE
        {
          <#{self.wikidata_uri.to_s}> p:P69 ?statement .
          ?statement ps:P69 ?inst .
          ?inst rdfs:label ?instLabel .
          FILTER(langMatches(lang(?instLabel), \"en\"))
          OPTIONAL {
            ?statement prov:wasDerivedFrom ?reference .
            ?reference pref:P248 ?source .
            ?source rdfs:label ?sourceLabel .
            FILTER(langMatches(lang(?sourceLabel), \"en\"))
          }
        }
      "
      get_data(sparql, :wikidata)
    end

    def brief_bio
      sparql = "
        #{self.wikidata_sparql_prefixes}

        SELECT DISTINCT ?description ?workLocation ?workLocationLabel
        WHERE
        {
          <#{self.wikidata_uri.to_s}> schema:description ?description .
          OPTIONAL {
            <#{self.wikidata_uri.to_s}> wdt:P937 ?workLocation .
          }
          SERVICE wikibase:label {
            bd:serviceParam wikibase:language \"en\" .
          }
          FILTER(langMatches(lang(?description), \"en\"))
        }
      "
      get_data(sparql, :wikidata).first
    end

    def notable_works
      sparql = "
        #{self.wikidata_sparql_prefixes}

        SELECT DISTINCT ?notableWork ?notableWorkLabel ?isbn ?oclcNumber
        WHERE
        {
          <#{self.wikidata_uri.to_s}> wdt:P800 ?notableWork .
          OPTIONAL {
            ?notableWork wdt:P212 ?isbn .
            ?notableWork wdt:P243 ?oclcNumber .
          }
          SERVICE wikibase:label {
            bd:serviceParam wikibase:language \"en\" .
          }
        }
      "
      notable_works = get_data(sparql, :wikidata)
      notable_works.select {|work| work["notableWorkLabel"] != nil and !work["notableWorkLabel"]["value"].match(/^Q\d+$/)}
    end

    protected

    # Send a SPARQL query to the named endpoint and return the result bindings.
    def get_data(sparql, source)
      url = SPARQL_ENDPOINTS[source] + URI.encode_www_form_component(sparql.gsub(/\n/, ' '))
      data = RestClient::Request.execute(method: :get, url: url, headers: {accept: "application/sparql-results+json"}, timeout: 5)
      parsed_data = JSON.parse data
      parsed_data["results"]["bindings"]
    end

    def wikidata_sparql_prefixes
      "
        PREFIX wikibase: <http://wikiba.se/ontology#>
        PREFIX p: <http://www.wikidata.org/prop/>
        PREFIX pref: <http://www.wikidata.org/prop/reference/>
        PREFIX ps: <http://www.wikidata.org/prop/statement/>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      "
    end

    def dbpedia_sparql_prefixes
      "
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX owl: <http://www.w3.org/2002/07/owl#>
        PREFIX foaf: <http://xmlns.com/foaf/0.1/>
        PREFIX dbo: <http://dbpedia.org/ontology/>
      "
    end

  end
end