rdf-microdata 2.2.2 → 3.1.3
- checksums.yaml +5 -5
- data/README.md +25 -27
- data/UNLICENSE +1 -1
- data/VERSION +1 -1
- data/etc/doap.html +9 -9
- data/etc/doap.nt +19 -19
- data/etc/doap.ttl +20 -21
- data/etc/registry.json +5 -0
- data/lib/rdf/microdata.rb +3 -4
- data/lib/rdf/microdata/expansion.rb +2 -3
- data/lib/rdf/microdata/format.rb +4 -30
- data/lib/rdf/microdata/rdfa_reader.rb +6 -17
- data/lib/rdf/microdata/reader.rb +7 -15
- data/lib/rdf/microdata/reader/nokogiri.rb +7 -5
- data/lib/rdf/microdata/registry.rb +1 -1
- metadata +51 -47
- data/lib/rdf/microdata/jsonld_reader.rb +0 -251
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA256:
+  metadata.gz: 2c364957ee6b1b8981d6aa274346df645609a702ebcba5718cbd4c096344125d
+  data.tar.gz: edeabdc8a3bcfc2df7c2aa25b0e60485bcaee87a5c15429cf81c0d47386c4fd8
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 23c53c41cbe0203ee765b46095060f1e65374a4af3f14ba3e0e8dbf1ed1a4e465d25dd12ae2419bf5eeb4d896834a3e38a740ba6004698da915e59746541272c
+  data.tar.gz: 64f6284ce624f9144d202479047d002b8ceeacc8abef2128741970500f44e963eeb3251224de6085414c58d47a8d6c72025ddd38df23a7da75b21f08b48caba4
|
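The new `checksums.yaml` records a SHA256 and a SHA512 digest for each archive inside the gem. As a rough sketch of how such a file could be checked, the code below compares digests computed with Ruby's standard `Digest` library against a parsed checksums document. The helper names `checksums_match?` and `checksums_for` and the sample bytes are assumptions made for illustration; they are not part of RubyGems or of this gem.

```ruby
require 'digest'
require 'yaml'

# Hypothetical helper (not part of RubyGems): true when both the SHA256 and
# SHA512 digests recorded for `name` match the digests of `bytes`.
def checksums_match?(checksums_yaml, name, bytes)
  recorded = YAML.safe_load(checksums_yaml)
  computed = {
    'SHA256' => Digest::SHA256.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
  computed.all? { |algorithm, digest| recorded.dig(algorithm, name).to_s == digest }
end

# Hypothetical helper: build a checksums document with the same two-level
# layout (algorithm, then archive name) for some stand-in bytes.
def checksums_for(name, bytes)
  {
    'SHA256' => { name => Digest::SHA256.hexdigest(bytes) },
    'SHA512' => { name => Digest::SHA512.hexdigest(bytes) }
  }.to_yaml
end
```

A missing entry is treated as a mismatch, which seems the safer default for an integrity check.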
data/README.md
CHANGED
@@ -2,8 +2,10 @@
|
|
2
2
|
|
3
3
|
[Microdata][] parser for RDF.rb.
|
4
4
|
|
5
|
-
[![Gem Version](https://badge.fury.io/rb/rdf-microdata.png)](
|
6
|
-
[![Build Status](https://
|
5
|
+
[![Gem Version](https://badge.fury.io/rb/rdf-microdata.png)](https://badge.fury.io/rb/rdf-microdata)
|
6
|
+
[![Build Status](https://github.com/ruby-rdf/rdf-microdata/workflows/CI/badge.svg?branch=develop)](https://github.com/ruby-rdf/rdf-microdata/actions?query=workflow%3ACI)
|
7
|
+
[![Coverage Status](https://coveralls.io/repos/ruby-rdf/rdf-microdata/badge.svg?branch=develop)](https://coveralls.io/github/ruby-rdf/rdf-microdata?branch=develop)
|
8
|
+
[![Gitter chat](https://badges.gitter.im/ruby-rdf/rdf.png)](https://gitter.im/ruby-rdf/rdf)
|
7
9
|
|
8
10
|
## DESCRIPTION
|
9
11
|
RDF::Microdata is a Microdata reader for Ruby using the [RDF.rb][RDF.rb] library suite.
|
@@ -45,11 +47,12 @@ GRDDL-type triple generation, such as for html>head>title anchor tags.
|
|
45
47
|
If the `RDFa` parser is available, {RDF::Microdata::Format} will not assert content type `text/html` or file extension `.html`, as this is also asserted by RDFa. Instead, the RDFa reader will invoke the microdata reader if an `@itemscope` attribute is detected.
|
46
48
|
|
47
49
|
## Dependencies
|
48
|
-
* [RDF.rb](
|
49
|
-
* [RDF::
|
50
|
-
* [
|
51
|
-
* [
|
52
|
-
*
|
50
|
+
* [RDF.rb](https://rubygems.org/gems/rdf) (~> 3.1)
|
51
|
+
* [RDF::RDFa](https://rubygems.org/gems/rdf-xsd) (~> 3.1)
|
52
|
+
* [RDF::XSD](https://rubygems.org/gems/rdf-xsd) (~> 3.1)
|
53
|
+
* [HTMLEntities](https://rubygems.org/gems/htmlentities) ('~> 4.3')
|
54
|
+
* [Nokogiri](https://rubygems.org/gems/nokogiri) (~> 1.10)
|
55
|
+
* Soft dependency on [Nokogumbo](https://github.com/rubys/nokogumbo) (~> 2.0)
|
53
56
|
|
54
57
|
## Documentation
|
55
58
|
Full documentation available on [Rubydoc.info][Microdata doc]
|
@@ -68,22 +71,15 @@ use {RDF::Microdata::RdfaReader} directly.
|
|
68
71
|
|
69
72
|
The reader exposes a `#rdfa` method, which can be used to retrieve the transformed HTML+RDFa
|
70
73
|
|
71
|
-
### JSON-lD-based Reader
|
72
|
-
There is an experimental reader based on transforming Microdata to JSON-LD. To invoke
|
73
|
-
this, add the `jsonld: true` option to the {RDF::Microdata::Reader.new}, or
|
74
|
-
use {RDF::Microdata::JsonLdReader} directly.
|
75
|
-
|
76
|
-
The reader exposes a `#json` method, which can be used to retrieve the generated JSON-LD
|
77
|
-
|
78
74
|
## Resources
|
79
75
|
* [RDF.rb][RDF.rb]
|
80
|
-
* [Documentation](
|
76
|
+
* [Documentation](https://www.rubydoc.info/github/ruby-rdf/rdf-microdata/)
|
81
77
|
* [History](file:History.md)
|
82
78
|
* [Microdata][]
|
83
79
|
* [Microdata RDF][]
|
84
80
|
|
85
81
|
## Author
|
86
|
-
* [Gregg Kellogg](
|
82
|
+
* [Gregg Kellogg](https://github.com/gkellogg) - <https://greggkellogg.net/>
|
87
83
|
|
88
84
|
## Contributing
|
89
85
|
|
@@ -97,25 +93,27 @@ The reader exposes a `#json` method, which can be used to retrieve the generated
|
|
97
93
|
list in the the `README`. Alphabetical order applies.
|
98
94
|
* Do note that in order for us to merge any non-trivial changes (as a rule
|
99
95
|
of thumb, additions larger than about 15 lines of code), we need an
|
100
|
-
explicit [public domain dedication][PDD] on record from you
|
96
|
+
explicit [public domain dedication][PDD] on record from you,
|
97
|
+
which you will be asked to agree to on the first commit to a repo within the organization.
|
98
|
+
Note that the agreement applies to all repos in the [Ruby RDF](https://github.com/ruby-rdf/) organization.
|
101
99
|
|
102
100
|
## License
|
103
101
|
|
104
102
|
This is free and unencumbered public domain software. For more information,
|
105
|
-
see <
|
103
|
+
see <https://unlicense.org/> or the accompanying {file:UNLICENSE} file.
|
106
104
|
|
107
105
|
## FEEDBACK
|
108
106
|
|
109
107
|
* gregg@greggkellogg.net
|
110
|
-
* <
|
111
|
-
* <
|
112
|
-
* <
|
108
|
+
* <https://rubygems.org/rdf-microdata>
|
109
|
+
* <https://github.com/ruby-rdf/rdf-microdata>
|
110
|
+
* <https://lists.w3.org/Archives/Public/public-rdf-ruby/>
|
113
111
|
|
114
112
|
[RDF.rb]: https://github.com/ruby-rdf/rdf
|
115
|
-
[YARD]:
|
116
|
-
[YARD-GS]:
|
117
|
-
[PDD]:
|
118
|
-
[Microdata]:
|
119
|
-
[Microdata RDF]:
|
120
|
-
[Microdata doc]:
|
113
|
+
[YARD]: https://yardoc.org/
|
114
|
+
[YARD-GS]: https://rubydoc.info/docs/yard/file/docs/GettingStarted.md
|
115
|
+
[PDD]: https://unlicense.org/#unlicensing-contributions
|
116
|
+
[Microdata]: https://dev.w3.org/html5/md/Overview.html "HTML Microdata"
|
117
|
+
[Microdata RDF]: https://dvcs.w3.org/hg/htmldata/raw-file/default/microdata-rdf/index.html "Microdata to RDF"
|
118
|
+
[Microdata doc]: https://rubydoc.info/github/ruby-rdf/rdf-microdata/frames
|
121
119
|
[Nokogumbo]: https://github.com/rubys/nokogumbo/#readme
|
data/UNLICENSE
CHANGED
@@ -21,4 +21,4 @@ OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
|
|
21
21
|
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
|
22
22
|
OTHER DEALINGS IN THE SOFTWARE.
|
23
23
|
|
24
|
-
For more information, please refer to <
|
24
|
+
For more information, please refer to <https://unlicense.org/1.0/>
|
data/VERSION
CHANGED
@@ -1 +1 @@
|
|
1
|
-
|
1
|
+
3.1.3
|
data/etc/doap.html
CHANGED
@@ -3,7 +3,7 @@
|
|
3
3
|
<head>
|
4
4
|
<title lang="en" itemprop="shortdesc">Microdata reader for Ruby.</title>
|
5
5
|
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
|
6
|
-
<base href="
|
6
|
+
<base href="https://rubygems.org/gems/rdf-microdata">
|
7
7
|
</head>
|
8
8
|
<body>
|
9
9
|
<p>Project description for <span itemprop="name">RDF::Microdata</span>.</p>
|
@@ -18,26 +18,26 @@
|
|
18
18
|
<dt>Created</dt><dd><time itemprop="created" datetime="2011-08-29">2011-08-29</time></dd>
|
19
19
|
<dt>Blog</dt><dd><a href="http://greggkellogg.net/" itemprop="blog">http://greggkellogg.net/</a></dd>
|
20
20
|
<dt>Bug DB</dt><dd>
|
21
|
-
<a href="
|
22
|
-
|
21
|
+
<a href="https://github.com/ruby-rdf/rdf-microdata/issues" itemprop="bug-database">
|
22
|
+
https://github.com/ruby-rdf/rdf-microdata/issues
|
23
23
|
</a>
|
24
24
|
</dd>
|
25
25
|
<dt>Category</dt><dd>
|
26
26
|
<a itemprop="category" href="http://dbpedia.org/resource/Resource_Description_Framework">Resource Description Framework</a>
|
27
27
|
for
|
28
|
-
<
|
28
|
+
<span itemprop="programming-language">Ruby</span>
|
29
29
|
</dd>
|
30
30
|
<dt>Implements</dt><dd>
|
31
31
|
<a itemprop="implements" href="http://www.w3.org/TR/microdata-rdf/">Microdata to RDF</a>
|
32
32
|
</dd>
|
33
|
-
<dt>Download</dt><dd><a href="
|
34
|
-
|
33
|
+
<dt>Download</dt><dd><a href="https://rubygems.org/gems/rdf-microdata" itemprop="download-page">
|
34
|
+
https://rubygems.org/gems/rdf-microdata
|
35
35
|
</a></dd>
|
36
|
-
<dt>Home Page</dt><dd><a href="
|
37
|
-
|
36
|
+
<dt>Home Page</dt><dd><a href="https://github.com/ruby-rdf/rdf-microdata" itemprop="homepage">
|
37
|
+
https://github.com/ruby-rdf/rdf-microdata
|
38
38
|
</a></dd>
|
39
39
|
<dt>License</dt><dd>
|
40
|
-
<a href="
|
40
|
+
<a href="https://unlicense.org/1.0/" itemprop="license">Public Domain</a>
|
41
41
|
</dd>
|
42
42
|
<dt>Mailing List</dt><dd><a href="http://lists.w3.org/Archives/Public/public-rdf-ruby/" itemprop="mailing-list">
|
43
43
|
http://lists.w3.org/Archives/Public/public-rdf-ruby/
|
data/etc/doap.nt
CHANGED
@@ -1,19 +1,19 @@
|
|
1
|
-
<
|
2
|
-
<
|
3
|
-
<
|
4
|
-
<
|
5
|
-
<
|
6
|
-
<
|
7
|
-
<
|
8
|
-
<
|
9
|
-
<
|
10
|
-
<
|
11
|
-
<
|
12
|
-
<
|
13
|
-
<
|
14
|
-
<
|
15
|
-
<
|
16
|
-
<
|
17
|
-
<
|
18
|
-
<
|
19
|
-
<
|
1
|
+
<https://rubygems.org/gems/rdf-microdata> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://usefulinc.com/ns/doap#Project> .
|
2
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#shortdesc> "Microdata reader for Ruby."@en .
|
3
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#description> "\n RDF::Microdata is an Microdata reader for Ruby using the RDF.rb library suite.\n "@en .
|
4
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#name> "RDF::Microdata" .
|
5
|
+
<https://rubygems.org/gems/rdf-microdata> <http://purl.org/dc/terms/creator> <http://greggkellogg.net/foaf#me> .
|
6
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#developer> <http://greggkellogg.net/foaf#me> .
|
7
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#documenter> <http://greggkellogg.net/foaf#me> .
|
8
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#maintainer> <http://greggkellogg.net/foaf#me> .
|
9
|
+
<https://rubygems.org/gems/rdf-microdata> <http://xmlns.com/foaf/0.1/creator> <http://greggkellogg.net/foaf#me> .
|
10
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#created> "2011-08-29"^^<http://www.w3.org/2001/XMLSchema#date> .
|
11
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#blog> <http://greggkellogg.net/> .
|
12
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#bug-database> <https://github.com/ruby-rdf/rdf-microdata/issues> .
|
13
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#category> <http://dbpedia.org/resource/Resource_Description_Framework> .
|
14
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#programming-language> "Ruby" .
|
15
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#implements> <http://www.w3.org/TR/microdata-rdf/> .
|
16
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#download-page> <https://rubygems.org/gems/rdf-microdata> .
|
17
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#homepage> <https://github.com/ruby-rdf/rdf-microdata> .
|
18
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#license> <https://unlicense.org/1.0/> .
|
19
|
+
<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#mailing-list> <http://lists.w3.org/Archives/Public/public-rdf-ruby/> .
|
data/etc/doap.ttl
CHANGED
@@ -1,27 +1,26 @@
|
|
1
|
-
@
|
1
|
+
@base <https://rubygems.org/gems/rdf-microdata> .
|
2
|
+
@prefix dc: <http://purl.org/dc/terms/> .
|
2
3
|
@prefix doap: <http://usefulinc.com/ns/doap#> .
|
3
4
|
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
|
4
|
-
@prefix rdf:
|
5
|
-
@prefix xsd:
|
5
|
+
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
|
6
|
+
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
|
6
7
|
|
7
|
-
|
8
|
-
|
9
|
-
doap:
|
10
|
-
doap:
|
8
|
+
<> a doap:Project;
|
9
|
+
doap:name "RDF::Microdata";
|
10
|
+
doap:shortdesc "Microdata reader for Ruby RDF.rb."@en;
|
11
|
+
doap:description "RDF::Microdata is an Microdata reader for Ruby using the RDF.rb library suite."@en;
|
12
|
+
dc:creator <https://greggkellogg.net/foaf#me>;
|
13
|
+
doap:blog <https://greggkellogg.net/>;
|
14
|
+
doap:bug-database <https://github.com/ruby-rdf/rdf-microdata/issues>;
|
11
15
|
doap:category <http://dbpedia.org/resource/Resource_Description_Framework>;
|
12
16
|
doap:created "2011-08-29"^^xsd:date;
|
13
|
-
doap:
|
14
|
-
|
15
|
-
|
16
|
-
doap:
|
17
|
-
doap:documenter <http://greggkellogg.net/foaf#me>;
|
18
|
-
doap:download-page <http://rubygems.org/gems/rdf-microdata>;
|
19
|
-
doap:homepage <http://github.com/ruby-rdf/rdf-microdata>;
|
17
|
+
doap:developer <https://greggkellogg.net/foaf#me>;
|
18
|
+
doap:documenter <https://greggkellogg.net/foaf#me>;
|
19
|
+
doap:download-page <>;
|
20
|
+
doap:homepage <https://github.com/ruby-rdf/rdf-microdata>;
|
20
21
|
doap:implements <http://www.w3.org/TR/microdata-rdf/>;
|
21
|
-
doap:license <
|
22
|
-
doap:mailing-list <
|
23
|
-
doap:maintainer <
|
24
|
-
doap:
|
25
|
-
|
26
|
-
doap:shortdesc "Microdata reader for Ruby."@en;
|
27
|
-
foaf:creator <http://greggkellogg.net/foaf#me> .
|
22
|
+
doap:license <https://unlicense.org/1.0/>;
|
23
|
+
doap:mailing-list <https://lists.w3.org/Archives/Public/public-rdf-ruby/>;
|
24
|
+
doap:maintainer <https://greggkellogg.net/foaf#me>;
|
25
|
+
doap:programming-language "Ruby";
|
26
|
+
foaf:creator <https://greggkellogg.net/foaf#me> .
|
data/etc/registry.json
CHANGED
@@ -4,5 +4,10 @@
       "additionalType": {"subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"}
     }
   },
+  "https://schema.org/": {
+    "properties": {
+      "additionalType": {"subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"}
+    }
+  },
   "http://microformats.org/profile/hcard": {}
 }
|
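The registry change adds an `https://schema.org/` entry that declares `additionalType` as a sub-property of `rdf:type`. The sketch below shows one way such a registry could be consulted, using only the standard `json` library. The `http://schema.org/` key is an assumption (the diff shows only the tail of that first entry), and the longest-prefix lookup in `registry_prefix` is a hypothetical stand-in; this diff does not show how the gem's `Registry` class actually matches prefixes.

```ruby
require 'json'

# Registry document shaped like the hunk above. The "http://schema.org/" key
# is assumed; only the end of that entry is visible in the diff.
REGISTRY = JSON.parse(<<~JSON)
  {
    "http://schema.org/": {
      "properties": {
        "additionalType": {"subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"}
      }
    },
    "https://schema.org/": {
      "properties": {
        "additionalType": {"subPropertyOf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"}
      }
    },
    "http://microformats.org/profile/hcard": {}
  }
JSON

# Hypothetical lookup: the longest registry prefix that the type IRI starts with.
def registry_prefix(type_iri)
  REGISTRY.keys.select { |prefix| type_iri.start_with?(prefix) }.max_by(&:length)
end

# Returns the declared super-property for `property`, or nil when the
# vocabulary is unknown or declares nothing for that property.
def sub_property_of(type_iri, property)
  prefix = registry_prefix(type_iri)
  prefix && REGISTRY.dig(prefix, 'properties', property, 'subPropertyOf')
end
```

With both keys present, types written with either the `http` or the `https` form of the schema.org namespace resolve to the same declaration.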
data/lib/rdf/microdata.rb
CHANGED
@@ -15,10 +15,10 @@ module RDF
|
|
15
15
|
# end
|
16
16
|
# end
|
17
17
|
#
|
18
|
-
# @see
|
19
|
-
# @see
|
18
|
+
# @see https://www.rubydoc.info/github/ruby-rdf/rdf/
|
19
|
+
# @see https://www.w3.org/TR/2011/WD-microdata-20110525/
|
20
20
|
#
|
21
|
-
# @author [Gregg Kellogg](
|
21
|
+
# @author [Gregg Kellogg](https://greggkellogg.net/)
|
22
22
|
module Microdata
|
23
23
|
USES_VOCAB = RDF::URI("http://www.w3.org/ns/rdfa#usesVocabulary")
|
24
24
|
DEFAULT_REGISTRY = File.expand_path("../../../etc/registry.json", __FILE__)
|
@@ -26,7 +26,6 @@ module RDF
|
|
26
26
|
require 'rdf/microdata/format'
|
27
27
|
require 'rdf/microdata/vocab'
|
28
28
|
autoload :Expansion, 'rdf/microdata/expansion'
|
29
|
-
autoload :JsonLdReader, 'rdf/microdata/jsonld_reader'
|
30
29
|
autoload :Profile, 'rdf/microdata/profile'
|
31
30
|
autoload :RdfaReader, 'rdf/microdata/rdfa_reader'
|
32
31
|
autoload :Reader, 'rdf/microdata/reader'
|
data/lib/rdf/microdata/expansion.rb
CHANGED
@@ -26,7 +26,6 @@ module RDF::Microdata
 repo = RDF::Repository.new
 repo << self # Add default graph
 
-count = repo.count
 log_debug("expand") {"Loaded #{repo.size} triples into default graph"}
 
 repo = owl_entailment(repo)
@@ -38,7 +37,7 @@ module RDF::Microdata
 end
 
 def rule(name, &block)
-  Rule.new(name,
+  Rule.new(name, **@options, &block)
 end
 
 ##
@@ -72,7 +71,7 @@ module RDF::Microdata
 # r.execute(queryable) {|statement| puts statement.inspect}
 #
 # @param [String] name
-def initialize(name, options
+def initialize(name, **options, &block)
   @antecedents = []
   @consequents = []
   @options = options.dup
|
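Several hunks in this release switch from a positional options Hash to `**options`, and forward stored options with a double splat. Ruby 3 no longer converts a positional Hash into keyword arguments automatically, so a caller holding a Hash has to splat it explicitly. The following is a minimal sketch of that calling convention; `Rule` and `Expander` here are simplified, hypothetical stand-ins that only mirror the signatures visible in the diff, not the gem's classes.

```ruby
# Hypothetical stand-in mirroring `initialize(name, **options, &block)`.
class Rule
  attr_reader :name, :options, :block

  def initialize(name, **options, &block)
    @name = name
    @options = options.dup # keep a private copy of the keyword options
    @block = block
  end
end

# Hypothetical owner object that stores options as a Hash.
class Expander
  def initialize(**options)
    @options = options
  end

  # `**@options` forwards the stored Hash as keyword arguments, and `&block`
  # passes the caller's block through unchanged.
  def rule(name, &block)
    Rule.new(name, **@options, &block)
  end
end
```

Writing `Rule.new(name, @options)` instead would pass the Hash as a second positional argument, which a constructor declared this way rejects under Ruby 3.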
data/lib/rdf/microdata/format.rb
CHANGED
@@ -19,7 +19,7 @@ module RDF::Microdata
|
|
19
19
|
# @example Obtaining serialization format MIME types
|
20
20
|
# RDF::Format.content_types #=> {"text/html" => [RDF::Microdata::Format]}
|
21
21
|
#
|
22
|
-
# @see
|
22
|
+
# @see https://www.w3.org/TR/rdf-testcases/#ntriples
|
23
23
|
class Format < RDF::Format
|
24
24
|
content_encoding 'utf-8'
|
25
25
|
|
@@ -55,7 +55,7 @@ module RDF::Microdata
|
|
55
55
|
format: :microdata
|
56
56
|
},
|
57
57
|
option_use: {output_format: :disabled},
|
58
|
-
lambda: ->(files, options) do
|
58
|
+
lambda: ->(files, **options) do
|
59
59
|
out = options[:output] || $stdout
|
60
60
|
xsl = Nokogiri::XSLT(%(<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
|
61
61
|
<xsl:param name="indent-increment" select="' '"/>
|
@@ -107,7 +107,7 @@ module RDF::Microdata
|
|
107
107
|
# If files are empty, either use options[::evaluate]
|
108
108
|
input = options[:evaluate] ? StringIO.new(options[:evaluate]) : STDIN
|
109
109
|
input.set_encoding(options.fetch(:encoding, Encoding::UTF_8))
|
110
|
-
RDF::Microdata::Reader.new(input, options.merge(rdfa: true)) do |reader|
|
110
|
+
RDF::Microdata::Reader.new(input, **options.merge(rdfa: true)) do |reader|
|
111
111
|
reader.rdfa.xpath("//text()").each do |txt|
|
112
112
|
txt.content = txt.content.to_s.strip
|
113
113
|
end
|
@@ -115,7 +115,7 @@ module RDF::Microdata
|
|
115
115
|
end
|
116
116
|
else
|
117
117
|
files.each do |file|
|
118
|
-
RDF::Microdata::Reader.open(file, options.merge(rdfa: true)) do |reader|
|
118
|
+
RDF::Microdata::Reader.open(file, **options.merge(rdfa: true)) do |reader|
|
119
119
|
reader.rdfa.xpath("//text()").each do |txt|
|
120
120
|
txt.content = txt.content.to_s.strip
|
121
121
|
end
|
@@ -125,32 +125,6 @@ module RDF::Microdata
|
|
125
125
|
end
|
126
126
|
end
|
127
127
|
},
|
128
|
-
"to-jsonld": {
|
129
|
-
description: "Transform HTML+Microdata into JSON-LD",
|
130
|
-
parse: false,
|
131
|
-
help: "to-jsonld files ...\nTransform HTML+Microdata into JSON-LD",
|
132
|
-
filter: {
|
133
|
-
format: :microdata
|
134
|
-
},
|
135
|
-
option_use: {output_format: :disabled},
|
136
|
-
lambda: ->(files, options) do
|
137
|
-
out = options[:output] || $stdout
|
138
|
-
if files.empty?
|
139
|
-
# If files are empty, either use options[::evaluate]
|
140
|
-
input = options[:evaluate] ? StringIO.new(options[:evaluate]) : STDIN
|
141
|
-
input.set_encoding(options.fetch(:encoding, Encoding::UTF_8))
|
142
|
-
RDF::Microdata::Reader.new(input, options.merge(jsonld: true)) do |reader|
|
143
|
-
out.puts reader.jsonld.to_json(::JSON::LD::JSON_STATE)
|
144
|
-
end
|
145
|
-
else
|
146
|
-
files.each do |file|
|
147
|
-
RDF::Microdata::Reader.open(file, options.merge(jsonld: true)) do |reader|
|
148
|
-
out.puts reader.jsonld.to_json(::JSON::LD::JSON_STATE)
|
149
|
-
end
|
150
|
-
end
|
151
|
-
end
|
152
|
-
end
|
153
|
-
},
|
154
128
|
}
|
155
129
|
end
|
156
130
|
end
|
data/lib/rdf/microdata/rdfa_reader.rb
CHANGED
@@ -29,7 +29,7 @@ module RDF::Microdata
|
|
29
29
|
# @yieldparam [RDF::Reader] reader
|
30
30
|
# @yieldreturn [void] ignored
|
31
31
|
# @raise [RDF::ReaderError] if _validate_
|
32
|
-
def initialize(input = $stdin, options
|
32
|
+
def initialize(input = $stdin, **options, &block)
|
33
33
|
@options = options
|
34
34
|
log_debug('', "using RDFa transformation reader")
|
35
35
|
|
@@ -46,15 +46,6 @@ module RDF::Microdata
|
|
46
46
|
::Nokogiri::HTML5(input.force_encoding(options[:encoding]))
|
47
47
|
end
|
48
48
|
|
49
|
-
# Load registry
|
50
|
-
begin
|
51
|
-
registry_uri = options[:registry] || RDF::Microdata::DEFAULT_REGISTRY
|
52
|
-
log_debug('', "registry = #{registry_uri.inspect}")
|
53
|
-
Registry.load_registry(registry_uri)
|
54
|
-
rescue JSON::ParserError => e
|
55
|
-
log_fatal("Failed to parse registry: #{e.message}", exception: RDF::ReaderError) if (root.nil? && validate?)
|
56
|
-
end
|
57
|
-
|
58
49
|
# For all members having @itemscope
|
59
50
|
input.css("[itemscope]").each do |item|
|
60
51
|
# Get @itemtypes to create @type and @vocab
|
@@ -69,21 +60,19 @@ module RDF::Microdata
 
     item['typeof'] = types.join(' ') unless types.empty?
     if vocab = types.first
-      vocab =
-      type_vocab = vocab.to_s.sub(/([\/\#])[^\/\#]*$/, '\1')
+      vocab = begin
+        type_vocab = vocab.to_s.sub(/([\/\#])[^\/\#]*$/, '\1')
         Registry.new(type_vocab) if type_vocab
       end
       item['vocab'] = vocab.uri.to_s if vocab
     end
   end
+  item['typeof'] ||= ''
 
   # Change each itemid attribute to an resource attribute with the same value
   if item['itemid']
     id = item.attribute('itemid').remove
-    item[
-  else
-    # Otherwise, ensure that @typeof has at least an empty value
-    item['typeof'] ||= ''
+    item['resource'] = id
   end
 end
 
|
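The hunk above derives a vocabulary IRI from the first item type by cutting the type at its last `/` or `#`, and it now always gives the element a `typeof` attribute, even an empty one. A small sketch of that derivation follows. The method names `type_vocab` and `rdfa_attributes` are illustrative assumptions, and the sketch deliberately skips the `Registry` lookup that the real reader performs before setting `vocab`.

```ruby
# Drop everything after the last "/" or "#" of a type IRI, keeping the delimiter.
# Same substitution as in the hunk above.
def type_vocab(type_iri)
  type_iri.to_s.sub(/([\/\#])[^\/\#]*$/, '\1')
end

# Hypothetical simplification: map an @itemtype value to the RDFa attributes
# the transformation would set, without consulting any registry.
def rdfa_attributes(itemtype)
  types = itemtype.to_s.split # whitespace-separated list of type IRIs
  attrs = { 'typeof' => types.join(' ') } # empty string when no types are given
  attrs['vocab'] = type_vocab(types.first) if types.first
  attrs
end
```

For example, `type_vocab("http://schema.org/Person")` yields the namespace with its trailing slash, and a hash-style IRI keeps its trailing `#`.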
@@ -126,7 +115,7 @@ module RDF::Microdata
|
|
126
115
|
version: :"rdfa1.1")
|
127
116
|
|
128
117
|
# Rely on RDFa reader
|
129
|
-
super(input, options, &block)
|
118
|
+
super(input, **options, &block)
|
130
119
|
end
|
131
120
|
end
|
132
121
|
end
|
data/lib/rdf/microdata/reader.rb
CHANGED
@@ -8,8 +8,8 @@ module RDF::Microdata
|
|
8
8
|
#
|
9
9
|
# Based on processing rules, amended with the following:
|
10
10
|
#
|
11
|
-
# @see
|
12
|
-
# @author [Gregg Kellogg](
|
11
|
+
# @see https://dvcs.w3.org/hg/htmldata/raw-file/0d6b89f5befb/microdata-rdf/index.html
|
12
|
+
# @author [Gregg Kellogg](https://greggkellogg.net/)
|
13
13
|
class Reader < RDF::Reader
|
14
14
|
format Format
|
15
15
|
include Expansion
|
@@ -39,7 +39,7 @@ module RDF::Microdata
|
|
39
39
|
|
40
40
|
##
|
41
41
|
# Reader options
|
42
|
-
# @see
|
42
|
+
# @see https://www.rubydoc.info/github/ruby-rdf/rdf/RDF/Reader#options-class_method
|
43
43
|
def self.options
|
44
44
|
super + [
|
45
45
|
RDF::CLI::Option.new(
|
@@ -54,7 +54,7 @@ module RDF::Microdata
 # Redirect for RDFa Reader given `:rdfa` option
 #
 # @private
-def self.new(input = nil, options
+def self.new(input = nil, **options, &block)
   klass = if options[:rdfa]
     # Requires rdf-rdfa gem to be loaded
     begin
@@ -63,19 +63,11 @@ module RDF::Microdata
       raise ReaderError, "Use of RDFa-based reader requires rdf-rdfa gem"
     end
     RdfaReader
-  elsif options[:jsonld]
-    # Requires rdf-rdfa gem to be loaded
-    begin
-      require 'json/ld'
-    rescue LoadError
-      raise ReaderError, "Use of JSON-LD-based reader requires json-ld gem"
-    end
-    JsonLdReader
   else
     self
   end
   reader = klass.allocate
-  reader.send(:initialize, input, options, &block)
+  reader.send(:initialize, input, **options, &block)
   reader
 end
 
|
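`Reader.new` is overridden so that the `:rdfa` option selects a different class, and the chosen class is instantiated with `allocate` followed by a direct call to `initialize`. The sketch below reproduces that pattern in isolation. Both classes are hypothetical simplifications: in the gem, `RdfaReader` does not appear to inherit from the Microdata reader (it delegates with `super` to the RDFa reader), and the real constructors do considerably more work.

```ruby
# Hypothetical, stripped-down reader that redirects construction.
class Reader
  attr_reader :input, :options

  # Pick the class from the options, then build the instance by hand.
  # Calling `klass.new` here would re-enter this overridden method when
  # `klass` is `self`, so `allocate` plus `initialize` is used instead.
  def self.new(input = nil, **options, &block)
    klass = options[:rdfa] ? RdfaReader : self
    reader = klass.allocate
    reader.send(:initialize, input, **options, &block) # initialize is private
    reader
  end

  def initialize(input = nil, **options, &block)
    @input = input
    @options = options
    block.call(self) if block
  end
end

# Hypothetical alternative implementation selected by `rdfa: true`.
class RdfaReader < Reader
end
```

With the JSON-LD branch removed in this release, `:rdfa` is the only option that changes which class is built.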
@@ -102,7 +94,7 @@ module RDF::Microdata
|
|
102
94
|
# @yieldparam [RDF::Reader] reader
|
103
95
|
# @yieldreturn [void] ignored
|
104
96
|
# @raise [Error] Raises `RDF::ReaderError` when validating
|
105
|
-
def initialize(input = $stdin, options
|
97
|
+
def initialize(input = $stdin, **options, &block)
|
106
98
|
super do
|
107
99
|
@library = :nokogiri
|
108
100
|
|
@@ -111,7 +103,7 @@ module RDF::Microdata
|
|
111
103
|
self.extend(@implementation)
|
112
104
|
|
113
105
|
input.rewind if input.respond_to?(:rewind)
|
114
|
-
initialize_html(input, options) rescue log_fatal($!.message, exception: RDF::ReaderError)
|
106
|
+
initialize_html(input, **options) rescue log_fatal($!.message, exception: RDF::ReaderError)
|
115
107
|
|
116
108
|
log_error("Empty document") if root.nil?
|
117
109
|
log_error(doc_errors.map(&:message).uniq.join("\n")) if !doc_errors.empty?
|
data/lib/rdf/microdata/reader/nokogiri.rb
CHANGED
@@ -3,7 +3,7 @@ module RDF::Microdata
|
|
3
3
|
##
|
4
4
|
# Nokogiri implementation of an HTML parser.
|
5
5
|
#
|
6
|
-
# @see
|
6
|
+
# @see https://nokogiri.org/
|
7
7
|
module Nokogiri
|
8
8
|
##
|
9
9
|
# Returns the name of the underlying XML library.
|
@@ -178,7 +178,7 @@ module RDF::Microdata
|
|
178
178
|
#
|
179
179
|
# @param [Hash{Symbol => Object}] options
|
180
180
|
# @return [void]
|
181
|
-
def initialize_html(input, options
|
181
|
+
def initialize_html(input, **options)
|
182
182
|
require 'nokogiri' unless defined?(::Nokogiri)
|
183
183
|
@doc = case input
|
184
184
|
when ::Nokogiri::XML::Document
|
@@ -194,7 +194,7 @@ module RDF::Microdata
|
|
194
194
|
begin
|
195
195
|
require 'nokogumbo' unless defined?(::Nokogumbo)
|
196
196
|
input = input.read if input.respond_to?(:read)
|
197
|
-
::Nokogiri::HTML5(input.force_encoding(options[:encoding]))
|
197
|
+
::Nokogiri::HTML5(input.force_encoding(options[:encoding]), max_parse_errors: 1000)
|
198
198
|
rescue LoadError
|
199
199
|
::Nokogiri::HTML.parse(input, base_uri.to_s, options[:encoding])
|
200
200
|
end
|
@@ -212,7 +212,9 @@ module RDF::Microdata
 ##
 # Document errors
 def doc_errors
-  @doc.errors.reject
+  @doc.errors.reject do |e|
+    e.to_s =~ %r{(The doctype must be the first token in the document)|(Expected a doctype token)|(Unexpected '\?' where start tag name is expected)}
+  end
 end
 
 ##
|
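The new `doc_errors` body drops parser errors whose message matches one of three known patterns (two about a missing or misplaced doctype, one about a `?` where a tag name is expected) and keeps everything else. The filtering can be tried without Nokogiri by applying the same regular expression to plain strings, as in this sketch. The method name `significant_errors` and the sample messages are assumptions for illustration.

```ruby
# Same alternation as in the hunk above; `\?` matches a literal question mark.
IGNORED_PARSE_ERRORS = %r{(The doctype must be the first token in the document)|(Expected a doctype token)|(Unexpected '\?' where start tag name is expected)}

# Hypothetical helper: keep only errors that do not match an ignored pattern.
# Anything responding to #to_s works, so real error objects could be passed too.
def significant_errors(errors)
  errors.reject { |e| e.to_s =~ IGNORED_PARSE_ERRORS }
end
```

Because the match is a substring match, a message still counts as ignored when the parser prefixes it with position information.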
@@ -230,7 +232,7 @@ module RDF::Microdata
|
|
230
232
|
##
|
231
233
|
# Based on Microdata element.getItems
|
232
234
|
#
|
233
|
-
# @see
|
235
|
+
# @see https://www.w3.org/TR/2011/WD-microdata-20110525/#top-level-microdata-items
|
234
236
|
def getItems
|
235
237
|
@doc.css('[itemscope]').select {|el| !el.has_attribute?('itemprop')}.map {|n| NodeProxy.new(n)}
|
236
238
|
end
|
data/lib/rdf/microdata/registry.rb
CHANGED
@@ -55,7 +55,7 @@ module RDF::Microdata
|
|
55
55
|
# Generate a predicateURI given a `name`
|
56
56
|
#
|
57
57
|
# @param [#to_s] name
|
58
|
-
# @param [
|
58
|
+
# @param [RDF::URI] base_uri base URI for resolving `name`.
|
59
59
|
# @return [RDF::URI]
|
60
60
|
def predicateURI(name, base_uri)
|
61
61
|
u = RDF::URI(name)
|
metadata
CHANGED
@@ -1,15 +1,15 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: rdf-microdata
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version:
|
4
|
+
version: 3.1.3
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Gregg
|
8
8
|
- Kellogg
|
9
|
-
autorequire:
|
9
|
+
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date:
|
12
|
+
date: 2021-03-15 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: rdf
|
@@ -17,34 +17,54 @@ dependencies:
|
|
17
17
|
requirements:
|
18
18
|
- - "~>"
|
19
19
|
- !ruby/object:Gem::Version
|
20
|
-
version: '
|
20
|
+
version: '3.1'
|
21
21
|
- - ">="
|
22
22
|
- !ruby/object:Gem::Version
|
23
|
-
version:
|
23
|
+
version: 3.1.13
|
24
24
|
type: :runtime
|
25
25
|
prerelease: false
|
26
26
|
version_requirements: !ruby/object:Gem::Requirement
|
27
27
|
requirements:
|
28
28
|
- - "~>"
|
29
29
|
- !ruby/object:Gem::Version
|
30
|
-
version: '
|
30
|
+
version: '3.1'
|
31
31
|
- - ">="
|
32
32
|
- !ruby/object:Gem::Version
|
33
|
-
version:
|
33
|
+
version: 3.1.13
|
34
|
+
- !ruby/object:Gem::Dependency
|
35
|
+
name: rdf-rdfa
|
36
|
+
requirement: !ruby/object:Gem::Requirement
|
37
|
+
requirements:
|
38
|
+
- - "~>"
|
39
|
+
- !ruby/object:Gem::Version
|
40
|
+
version: '3.1'
|
41
|
+
- - ">="
|
42
|
+
- !ruby/object:Gem::Version
|
43
|
+
version: 3.1.3
|
44
|
+
type: :runtime
|
45
|
+
prerelease: false
|
46
|
+
version_requirements: !ruby/object:Gem::Requirement
|
47
|
+
requirements:
|
48
|
+
- - "~>"
|
49
|
+
- !ruby/object:Gem::Version
|
50
|
+
version: '3.1'
|
51
|
+
- - ">="
|
52
|
+
- !ruby/object:Gem::Version
|
53
|
+
version: 3.1.3
|
34
54
|
- !ruby/object:Gem::Dependency
|
35
55
|
name: rdf-xsd
|
36
56
|
requirement: !ruby/object:Gem::Requirement
|
37
57
|
requirements:
|
38
58
|
- - "~>"
|
39
59
|
- !ruby/object:Gem::Version
|
40
|
-
version: '
|
60
|
+
version: '3.1'
|
41
61
|
type: :runtime
|
42
62
|
prerelease: false
|
43
63
|
version_requirements: !ruby/object:Gem::Requirement
|
44
64
|
requirements:
|
45
65
|
- - "~>"
|
46
66
|
- !ruby/object:Gem::Version
|
47
|
-
version: '
|
67
|
+
version: '3.1'
|
48
68
|
- !ruby/object:Gem::Dependency
|
49
69
|
name: htmlentities
|
50
70
|
requirement: !ruby/object:Gem::Requirement
|
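The gemspec now constrains `rdf` with two requirements at once, `~> 3.1` and `>= 3.1.13`. To see which versions that combination admits, the requirement can be evaluated with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The constant and method names in this sketch are assumptions chosen for the example.

```ruby
require 'rubygems'

# The two constraints recorded for the `rdf` dependency in the metadata above.
RDF_REQUIREMENT = Gem::Requirement.new('~> 3.1', '>= 3.1.13')

# Hypothetical helper: true when the version string satisfies both constraints.
def satisfies_rdf_requirement?(version)
  RDF_REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end
```

The pessimistic operator `~> 3.1` allows any 3.x release from 3.1 upward but excludes 4.0, while the second constraint raises the floor to 3.1.13.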
@@ -65,14 +85,14 @@ dependencies:
|
|
65
85
|
requirements:
|
66
86
|
- - "~>"
|
67
87
|
- !ruby/object:Gem::Version
|
68
|
-
version: '1.
|
88
|
+
version: '1.10'
|
69
89
|
type: :runtime
|
70
90
|
prerelease: false
|
71
91
|
version_requirements: !ruby/object:Gem::Requirement
|
72
92
|
requirements:
|
73
93
|
- - "~>"
|
74
94
|
- !ruby/object:Gem::Version
|
75
|
-
version: '1.
|
95
|
+
version: '1.10'
|
76
96
|
- !ruby/object:Gem::Dependency
|
77
97
|
name: equivalent-xml
|
78
98
|
requirement: !ruby/object:Gem::Requirement
|
@@ -107,98 +127,84 @@ dependencies:
|
|
107
127
|
requirements:
|
108
128
|
- - "~>"
|
109
129
|
- !ruby/object:Gem::Version
|
110
|
-
version: '3.
|
130
|
+
version: '3.10'
|
111
131
|
type: :development
|
112
132
|
prerelease: false
|
113
133
|
version_requirements: !ruby/object:Gem::Requirement
|
114
134
|
requirements:
|
115
135
|
- - "~>"
|
116
136
|
- !ruby/object:Gem::Version
|
117
|
-
version: '3.
|
137
|
+
version: '3.10'
|
118
138
|
- !ruby/object:Gem::Dependency
|
119
139
|
name: rspec-its
|
120
140
|
requirement: !ruby/object:Gem::Requirement
|
121
141
|
requirements:
|
122
142
|
- - "~>"
|
123
143
|
- !ruby/object:Gem::Version
|
124
|
-
version: '1.
|
125
|
-
type: :development
|
126
|
-
prerelease: false
|
127
|
-
version_requirements: !ruby/object:Gem::Requirement
|
128
|
-
requirements:
|
129
|
-
- - "~>"
|
130
|
-
- !ruby/object:Gem::Version
|
131
|
-
version: '1.2'
|
132
|
-
- !ruby/object:Gem::Dependency
|
133
|
-
name: json-ld
|
134
|
-
requirement: !ruby/object:Gem::Requirement
|
135
|
-
requirements:
|
136
|
-
- - "~>"
|
137
|
-
- !ruby/object:Gem::Version
|
138
|
-
version: '2.1'
|
144
|
+
version: '1.3'
|
139
145
|
type: :development
|
140
146
|
prerelease: false
|
141
147
|
version_requirements: !ruby/object:Gem::Requirement
|
142
148
|
requirements:
|
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '1.3'
 - !ruby/object:Gem::Dependency
   name: rdf-spec
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.1'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.1'
 - !ruby/object:Gem::Dependency
-  name: rdf-
+  name: rdf-turtle
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.1'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.1'
 - !ruby/object:Gem::Dependency
-  name: rdf-
+  name: rdf-isomorphic
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.1'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.1'
 - !ruby/object:Gem::Dependency
-  name:
+  name: json-ld
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.1'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '3.1'
 description: Reads HTML Microdata as RDF.
 email: public-rdf-ruby@w3.org
 executables: []
@@ -216,18 +222,17 @@ files:
 - lib/rdf/microdata.rb
 - lib/rdf/microdata/expansion.rb
 - lib/rdf/microdata/format.rb
-- lib/rdf/microdata/jsonld_reader.rb
 - lib/rdf/microdata/rdfa_reader.rb
 - lib/rdf/microdata/reader.rb
 - lib/rdf/microdata/reader/nokogiri.rb
 - lib/rdf/microdata/registry.rb
 - lib/rdf/microdata/version.rb
 - lib/rdf/microdata/vocab.rb
-homepage:
+homepage: https://ruby-rdf.github.com/rdf-microdata
 licenses:
 - Unlicense
 metadata: {}
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -235,16 +240,15 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: 2.
+      version: '2.4'
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-
-
-signing_key:
+rubygems_version: 3.2.3
+signing_key:
 specification_version: 4
 summary: Microdata reader for Ruby.
 test_files: []
@@ -1,251 +0,0 @@
-require 'json/ld'
-require 'nokogumbo'
-
-module RDF::Microdata
-  ##
-  # Update DOM to turn Microdata into JSON-LD and parse using the JSON-LD Reader
-  class JsonLdReader < JSON::LD::Reader
-    # The resulting JSON-LD
-    # @return [Hash]
-    attr_reader :jsonld
-
-    def self.format(klass = nil)
-      if klass.nil?
-        RDF::Microdata::Format
-      else
-        super
-      end
-    end
-
-    ##
-    # Initializes the JsonLdReader instance.
-    #
-    # @param [IO, File, String] input
-    #   the input stream to read
-    # @param [Hash{Symbol => Object}] options
-    #   any additional options (see `RDF::Reader#initialize`)
-    # @return [reader]
-    # @yield [reader] `self`
-    # @yieldparam [RDF::Reader] reader
-    # @yieldreturn [void] ignored
-    # @raise [RDF::ReaderError] if _validate_
-    def initialize(input = $stdin, options = {}, &block)
-      @options = options
-      log_debug('', "using JSON-LD transformation reader")
-
-      input = case input
-      when ::Nokogiri::XML::Document, ::Nokogiri::HTML::Document then input
-      else
-        # Try to detect charset from input
-        options[:encoding] ||= input.charset if input.respond_to?(:charset)
-
-        # Otherwise, default is utf-8
-        options[:encoding] ||= 'utf-8'
-        options[:encoding] = options[:encoding].to_s if options[:encoding]
-        input = input.read if input.respond_to?(:read)
-        ::Nokogiri::HTML5(input.force_encoding(options[:encoding]))
-      end
-
-      # Load registry
-      begin
-        registry_uri = options[:registry] || RDF::Microdata::DEFAULT_REGISTRY
-        log_debug('', "registry = #{registry_uri.inspect}")
-        Registry.load_registry(registry_uri)
-      rescue JSON::ParserError => e
-        log_fatal("Failed to parse registry: #{e.message}", exception: RDF::ReaderError) if (root.nil? && validate?)
-      end
-
-      @jsonld = {'@graph' => []}
-
-      # Start with all top-level items
-      input.css("[itemscope]").each do |item|
-        next if item['itemprop'] # Only top-level items
-        jsonld['@graph'] << get_object(item)
-      end
-
-      log_debug('', "Transformed document: #{jsonld.to_json(JSON::LD::JSON_STATE)}")
-
-      # Rely on RDFa reader
-      super(jsonld.to_json, options, &block)
-    end
-
-    private
-    # Return JSON-LD representation of an item
-    # @param [Nokogiri::XML::Element] item
-    # @param [Hash{Nokogiri::XML::Node => Hash}]
-    # @return [Hash]
-    def get_object(item, memory = {})
-      if result = memory[item]
-        # Result is a reference to that item; assign a blank-node identifier if necessary
-        result['@id'] ||= alloc_bnode
-        return result
-      end
-
-      result = {}
-      memory[item] = result
-
-      # If the item has a global identifier, add an entry to result called "@id" whose value is the global identifier of item.
-      result['@id'] = item['itemid'].to_s if item['itemid']
-
-      # If the item has any item types, add an entry to result called "@type" whose value is an array listing the item types of item, in the order they were specified on the itemtype attribute.
-      if item['itemtype']
-        # Only absolute URLs
-        types = item.attribute('itemtype').
-          remove.
-          to_s.
-          split(/\s+/).
-          select {|t| RDF::URI(t).absolute?}
-        if vocab = types.first
-          vocab = Registry.find(vocab) || begin
-            type_vocab = vocab.to_s.sub(/([\/\#])[^\/\#]*$/, '\1') unless vocab.nil?
-            Registry.new(type_vocab) if type_vocab
-          end
-          (result['@context'] = {})['@vocab'] = vocab.uri.to_s if vocab
-          result['@type'] = types unless types.empty?
-        end
-      end
-
-      # For each element element that has one or more property names and is one of the properties of the item item, in the order those elements are given by the algorithm that returns the properties of an item, run the following substeps
-      item_properties(item).each do |element|
-        value = if element['itemscope']
-          get_object(element, memory)
-        else
-          property_value(element)
-        end
-        element['itemprop'].to_s.split(/\s+/).each do |prop|
-          result[prop] ||= [] << value
-        end
-      end
-
-      result
-    end
-
-    ##
-    #
-    # @param [Nokogiri::XML::Element] item
-    # @return [Array<Nokogiri::XML::Element>]
-    #   List of property elements for an item
-    def item_properties(item)
-      results, memory, pending = [], [item], item.children.select(&:element?)
-      log_debug(item, "item_properties")
-
-      # If root has an itemref attribute, split the value of that itemref attribute on spaces. For each resulting token ID, if there is an element in the document whose ID is ID, then add the first such element to pending.
-      item['itemref'].to_s.split(/\s+/).each do |ref|
-        if referenced = referenced = item.at_css("##{ref}")
-          pending << referenced
-        end
-      end
-
-      while !pending.empty?
-        current = pending.shift
-        # Error
-        break if memory.include?(current)
-        memory << current
-
-        # If current does not have an itemscope attribute, then: add all the child elements of current to pending.
-        pending += current.children.select(&:element?) unless current['itemscope']
-
-        # If current has an itemprop attribute specified and has one or more property names, then add current to results.
-        results << current unless current['itemprop'].to_s.split(/\s+/).empty?
-      end
-
-      results
-    end
-
-    ##
-    #
-    def property_value(element)
-      base = element.base || base_uri
-      log_debug(element) {"property_value(#{element.name}): base #{base.inspect}"}
-      value = case
-      when element.has_attribute?('itemscope')
-        {}
-      when element.has_attribute?('content')
-        if element.language
-          {"@value" => element['content'].to_s.strip, language: element.language}
-        else
-          element['content'].to_s.strip
-        end
-      when %w(data meter).include?(element.name) && element.attribute('value')
-        # XXX parse as number?
-        {"@value" => element['value'].to_s.strip}
-      when %w(audio embed iframe img source track video).include?(element.name)
-        {"@id" => uri(element.attribute('src'), base).to_s}
-      when %w(a area link).include?(element.name)
-        {"@id" => uri(element.attribute('href'), base).to_s}
-      when %w(object).include?(element.name)
-        {"@id" => uri(element.attribute('data'), base).to_s}
-      when %w(time).include?(element.name)
-        # use datatype?
-        (element.attribute('datetime') || element.text).to_s.strip
-      else
-        if element.language
-          {"@value" => element.inner_text.to_s.strip, language: element.language}
-        else
-          element.inner_text.to_s.strip
-        end
-      end
-      log_debug(element) {" #{value.inspect}"}
-      value
-    end
-
-    # Allocate a new blank node identifier
-    # @return [String]
-    def alloc_bnode
-      @bnode_base ||= "_:a"
-      res = @bnode_base
-      @bnode_base = res.succ
-      res
-    end
-
-    # Fixme, what about xml:base relative to element?
-    def uri(value, base = nil)
-      value = if base
-        base = uri(base) unless base.is_a?(RDF::URI)
-        base.join(value.to_s)
-      else
-        RDF::URI(value.to_s)
-      end
-      value.validate! if validate?
-      value.canonicalize! if canonicalize?
-      value = RDF::URI.intern(value) if intern?
-      value
-    end
-  end
-end
-
-# Monkey Patch Nokogiri
-module Nokogiri::XML
-  class Element
-
-    ##
-    # Get any xml:base in effect for this element
-    def base
-      if @base.nil?
-        @base = attributes['xml:base'] ||
-        (parent && parent.element? && parent.base) ||
-        false
-      end
-
-      @base == false ? nil : @base
-    end
-
-
-    ##
-    # Get any xml:lang or lang in effect for this element
-    def language
-      if @language.nil?
-        language = case
-        when self["xml:lang"]
-          self["xml:lang"].to_s
-        when self["lang"]
-          self["lang"].to_s
-        else
-          parent && parent.element? && parent.language
-        end
-      end
-      @language == false ? nil : @language
-    end
-
-  end
-end
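The comments in the removed `item_properties` method describe a crawl over an item's descendants: seed a queue with the item's child elements plus any `itemref` targets, skip the children of nested `itemscope` elements, and collect every element that carries an `itemprop`. The following is a minimal sketch of that crawl and not the removed implementation. It assumes a hypothetical `Node` class and a `by_id` lookup hash in place of Nokogiri, and it skips an already-visited element with `next` where the removed code used `break`.

```ruby
# Hypothetical stand-in for a DOM element (not Nokogiri, not rdf-microdata).
class Node
  attr_reader :name, :attrs, :children

  def initialize(name, attrs = {}, children = [])
    @name, @attrs, @children = name, attrs, children
  end

  # Attribute lookup; returns nil when the attribute is absent.
  def [](key)
    @attrs[key]
  end
end

# Collect the property elements of +item+.
# +by_id+ maps element ids to nodes and stands in for a document-wide lookup.
def item_properties(item, by_id = {})
  results = []
  memory  = [item]
  pending = item.children.dup

  # itemref pulls elements from elsewhere in the document into the crawl.
  item['itemref'].to_s.split(/\s+/).each do |ref|
    referenced = by_id[ref]
    pending << referenced if referenced
  end

  until pending.empty?
    current = pending.shift
    next if memory.include?(current) # already visited (assumption: skip, not abort)
    memory << current

    # A nested item owns its own properties, so do not descend into it.
    pending.concat(current.children) unless current['itemscope']

    # An element with at least one property name is a property of the item.
    results << current unless current['itemprop'].to_s.split(/\s+/).empty?
  end

  results
end

# Usage: one wrapped property, one nested item, and one itemref target.
name    = Node.new('span', { 'itemprop' => 'name' })
street  = Node.new('span', { 'itemprop' => 'street' })
address = Node.new('div', { 'itemprop' => 'address', 'itemscope' => '' }, [street])
email   = Node.new('span', { 'id' => 'e', 'itemprop' => 'email' })
wrapper = Node.new('p', {}, [name])
person  = Node.new('div', { 'itemscope' => '', 'itemref' => 'e' }, [wrapper, address])

props = item_properties(person, { 'e' => email })
puts props.map { |el| el['itemprop'] }.inspect
```

In this sketch `street` is not reported for `person`, because it sits inside the nested `address` item and would be picked up only when that item is crawled in turn.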