rdf-microdata 2.2.0 → 3.1.1
- checksums.yaml +5 -5
- data/README.md +25 -19
- data/UNLICENSE +1 -1
- data/VERSION +1 -1
- data/etc/doap.html +9 -9
- data/etc/doap.nt +19 -19
- data/etc/doap.ttl +17 -16
- data/lib/rdf/microdata.rb +10 -7
- data/lib/rdf/microdata/expansion.rb +2 -3
- data/lib/rdf/microdata/format.rb +87 -1
- data/lib/rdf/microdata/rdfa_reader.rb +121 -0
- data/lib/rdf/microdata/reader.rb +73 -160
- data/lib/rdf/microdata/reader/nokogiri.rb +13 -5
- data/lib/rdf/microdata/registry.rb +109 -0
- metadata +44 -29
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA256:
+  metadata.gz: 5cf57a5354b695916b671d885b9a9b024ef084b0c3ec2ae34bcf7668264f2c27
+  data.tar.gz: 8e3e2c9c8a212e77bd652b76998246bbf9ef019e85be8c18001fa4d9db887b79
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e1ad276f98f0ef1b83fabb7b37b8d96ec7df2f4c139fedd1681be98e9eccc62e65e7c363aecb46a0208c979ab99fc7d882343b408624d19396801d5cd2894ffb
+  data.tar.gz: 6343a17bca689225145a1f6bcd6364ab284b47ba1c8dac0c4bf64578df14955f08bf003500a7938127cdcc3b40b2460c865f49a3ed0f4eb6fcec6cb02692ac2c
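The checksums.yaml hunk records a SHA256 and a SHA512 digest for each of `metadata.gz` and `data.tar.gz`. As a rough sketch of how such digests could be recomputed for comparison, using only Ruby's standard `Digest` library (the helper name `gem_checksums` is hypothetical, and the empty string is used purely as a well-known test vector, not as real package data):

```ruby
require 'digest'

# Hypothetical helper: compute the two digest kinds that checksums.yaml
# lists, given the raw bytes of one archive member.
def gem_checksums(bytes)
  {
    'SHA256' => Digest::SHA256.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

# Empty input is a standard test vector; a real check would pass the bytes
# of metadata.gz or data.tar.gz extracted from the .gem archive.
sums = gem_checksums('')
puts sums['SHA256']
puts sums['SHA512'].length
```

In practice the computed hex strings would be compared against the values in the `+` lines above.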
data/README.md
CHANGED
@@ -2,8 +2,8 @@
 
 [Microdata][] parser for RDF.rb.
 
-[](
-[](
+[](https://badge.fury.io/rb/rdf-microdata)
+[](https://travis-ci.org/ruby-rdf/rdf-microdata)
 
 ## DESCRIPTION
 RDF::Microdata is a Microdata reader for Ruby using the [RDF.rb][RDF.rb] library suite.
@@ -45,11 +45,11 @@ GRDDL-type triple generation, such as for html>head>title anchor tags.
 If the `RDFa` parser is available, {RDF::Microdata::Format} will not assert content type `text/html` or file extension `.html`, as this is also asserted by RDFa. Instead, the RDFa reader will invoke the microdata reader if an `@itemscope` attribute is detected.
 
 ## Dependencies
-* [RDF.rb](
-* [RDF::XSD](
+* [RDF.rb](https://rubygems.org/gems/rdf) (>= 3.1)
+* [RDF::XSD](https://rubygems.org/gems/rdf-xsd) (>= 3.1)
 * [HTMLEntities](https://rubygems.org/gems/htmlentities) ('>= 4.3.0')
-* [Nokogiri](
-* Soft dependency on [Nokogumbo](https://github.com/rubys/nokogumbo) (
+* [Nokogiri](https://rubygems.org/gems/nokogiri) (>= 1.10)
+* Soft dependency on [Nokogumbo](https://github.com/rubys/nokogumbo) (~> 2.0)
 
 ## Documentation
 Full documentation available on [Rubydoc.info][Microdata doc]
@@ -60,17 +60,23 @@ Full documentation available on [Rubydoc.info][Microdata doc]
 * {RDF::Microdata::Reader}
 * {RDF::Microdata::Reader::Nokogiri}
 
-
+
+### RDFa-based Reader
+There is an experimental reader based on transforming Microdata to RDFa within the DOM. To invoke
+this, add the `rdfa: true` option to the {RDF::Microdata::Reader.new}, or
+use {RDF::Microdata::RdfaReader} directly.
+
+The reader exposes a `#rdfa` method, which can be used to retrieve the transformed HTML+RDFa
 
 ## Resources
 * [RDF.rb][RDF.rb]
-* [Documentation](
+* [Documentation](https://www.rubydoc.info/github/ruby-rdf/rdf-microdata/)
 * [History](file:History.md)
 * [Microdata][]
 * [Microdata RDF][]
 
 ## Author
-* [Gregg Kellogg](
+* [Gregg Kellogg](https://github.com/gkellogg) - <https://greggkellogg.net/>
 
 ## Contributing
 
@@ -89,20 +95,20 @@ Full documentation available on [Rubydoc.info][Microdata doc]
 ## License
 
 This is free and unencumbered public domain software. For more information,
-see <
+see <https://unlicense.org/> or the accompanying {file:UNLICENSE} file.
 
 ## FEEDBACK
 
 * gregg@greggkellogg.net
-* <
-* <
-* <
+* <https://rubygems.org/rdf-microdata>
+* <https://github.com/ruby-rdf/rdf-microdata>
+* <https://lists.w3.org/Archives/Public/public-rdf-ruby/>
 
 [RDF.rb]: https://github.com/ruby-rdf/rdf
-[YARD]:
-[YARD-GS]:
-[PDD]:
-[Microdata]:
-[Microdata RDF]:
-[Microdata doc]:
+[YARD]: https://yardoc.org/
+[YARD-GS]: https://rubydoc.info/docs/yard/file/docs/GettingStarted.md
+[PDD]: https://lists.w3.org/Archives/Public/public-rdf-ruby/2010May/0013.html
+[Microdata]: https://dev.w3.org/html5/md/Overview.html "HTML Microdata"
+[Microdata RDF]: https://dvcs.w3.org/hg/htmldata/raw-file/default/microdata-rdf/index.html "Microdata to RDF"
+[Microdata doc]: https://rubydoc.info/github/ruby-rdf/rdf-microdata/frames
 [Nokogumbo]: https://github.com/rubys/nokogumbo/#readme
data/UNLICENSE
CHANGED
@@ -21,4 +21,4 @@ OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
 ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
 OTHER DEALINGS IN THE SOFTWARE.
 
-For more information, please refer to <
+For more information, please refer to <https://unlicense.org/1.0/>
data/VERSION
CHANGED
@@ -1 +1 @@
-
+3.1.1
data/etc/doap.html
CHANGED
@@ -3,7 +3,7 @@
 <head>
 <title lang="en" itemprop="shortdesc">Microdata reader for Ruby.</title>
 <meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
-<base href="
+<base href="https://rubygems.org/gems/rdf-microdata">
 </head>
 <body>
 <p>Project description for <span itemprop="name">RDF::Microdata</span>.</p>
@@ -18,26 +18,26 @@
 <dt>Created</dt><dd><time itemprop="created" datetime="2011-08-29">2011-08-29</time></dd>
 <dt>Blog</dt><dd><a href="http://greggkellogg.net/" itemprop="blog">http://greggkellogg.net/</a></dd>
 <dt>Bug DB</dt><dd>
-<a href="
-
+<a href="https://github.com/ruby-rdf/rdf-microdata/issues" itemprop="bug-database">
+https://github.com/ruby-rdf/rdf-microdata/issues
 </a>
 </dd>
 <dt>Category</dt><dd>
 <a itemprop="category" href="http://dbpedia.org/resource/Resource_Description_Framework">Resource Description Framework</a>
 for
-<
+<span itemprop="programming-language">Ruby</span>
 </dd>
 <dt>Implements</dt><dd>
 <a itemprop="implements" href="http://www.w3.org/TR/microdata-rdf/">Microdata to RDF</a>
 </dd>
-<dt>Download</dt><dd><a href="
-
+<dt>Download</dt><dd><a href="https://rubygems.org/gems/rdf-microdata" itemprop="download-page">
+https://rubygems.org/gems/rdf-microdata
 </a></dd>
-<dt>Home Page</dt><dd><a href="
-
+<dt>Home Page</dt><dd><a href="https://github.com/ruby-rdf/rdf-microdata" itemprop="homepage">
+https://github.com/ruby-rdf/rdf-microdata
 </a></dd>
 <dt>License</dt><dd>
-<a href="
+<a href="https://unlicense.org/1.0/" itemprop="license">Public Domain</a>
 </dd>
 <dt>Mailing List</dt><dd><a href="http://lists.w3.org/Archives/Public/public-rdf-ruby/" itemprop="mailing-list">
 http://lists.w3.org/Archives/Public/public-rdf-ruby/
data/etc/doap.nt
CHANGED
@@ -1,19 +1,19 @@
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
+<https://rubygems.org/gems/rdf-microdata> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://usefulinc.com/ns/doap#Project> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#shortdesc> "Microdata reader for Ruby."@en .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#description> "\n RDF::Microdata is an Microdata reader for Ruby using the RDF.rb library suite.\n "@en .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#name> "RDF::Microdata" .
+<https://rubygems.org/gems/rdf-microdata> <http://purl.org/dc/terms/creator> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#developer> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#documenter> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#maintainer> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://xmlns.com/foaf/0.1/creator> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#created> "2011-08-29"^^<http://www.w3.org/2001/XMLSchema#date> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#blog> <http://greggkellogg.net/> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#bug-database> <https://github.com/ruby-rdf/rdf-microdata/issues> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#category> <http://dbpedia.org/resource/Resource_Description_Framework> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#programming-language> "Ruby" .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#implements> <http://www.w3.org/TR/microdata-rdf/> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#download-page> <https://rubygems.org/gems/rdf-microdata> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#homepage> <https://github.com/ruby-rdf/rdf-microdata> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#license> <https://unlicense.org/1.0/> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#mailing-list> <http://lists.w3.org/Archives/Public/public-rdf-ruby/> .
data/etc/doap.ttl
CHANGED
@@ -1,27 +1,28 @@
-@
+@base <https://rubygems.org/gems/rdf-microdata> .
+@prefix dc: <http://purl.org/dc/terms/> .
 @prefix doap: <http://usefulinc.com/ns/doap#> .
 @prefix foaf: <http://xmlns.com/foaf/0.1/> .
-@prefix rdf:
-@prefix xsd:
+@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
+@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
 
-
-dc:creator <
-doap:blog <
-doap:bug-database <
+<> a doap:Project;
+dc:creator <https://greggkellogg.net/foaf#me>;
+doap:blog <https://greggkellogg.net/>;
+doap:bug-database <https://github.com/ruby-rdf/rdf-microdata/issues>;
 doap:category <http://dbpedia.org/resource/Resource_Description_Framework>;
 doap:created "2011-08-29"^^xsd:date;
 doap:description """
 RDF::Microdata is an Microdata reader for Ruby using the RDF.rb library suite.
 """@en;
-doap:developer <
-doap:documenter <
-doap:download-page
-doap:homepage <
+doap:developer <https://greggkellogg.net/foaf#me>;
+doap:documenter <https://greggkellogg.net/foaf#me>;
+doap:download-page <>;
+doap:homepage <https://github.com/ruby-rdf/rdf-microdata>;
 doap:implements <http://www.w3.org/TR/microdata-rdf/>;
-doap:license <
-doap:mailing-list <
-doap:maintainer <
+doap:license <https://unlicense.org/1.0/>;
+doap:mailing-list <https://lists.w3.org/Archives/Public/public-rdf-ruby/>;
+doap:maintainer <https://greggkellogg.net/foaf#me>;
 doap:name "RDF::Microdata";
-doap:programming-language
+doap:programming-language "Ruby";
 doap:shortdesc "Microdata reader for Ruby."@en;
-foaf:creator <
+foaf:creator <https://greggkellogg.net/foaf#me> .
data/lib/rdf/microdata.rb
CHANGED
@@ -15,18 +15,21 @@ module RDF
   # end
   # end
   #
-  # @see
-  # @see
+  # @see https://www.rubydoc.info/github/ruby-rdf/rdf/
+  # @see https://www.w3.org/TR/2011/WD-microdata-20110525/
   #
-  # @author [Gregg Kellogg](
+  # @author [Gregg Kellogg](https://greggkellogg.net/)
   module Microdata
     USES_VOCAB = RDF::URI("http://www.w3.org/ns/rdfa#usesVocabulary")
+    DEFAULT_REGISTRY = File.expand_path("../../../etc/registry.json", __FILE__)
 
     require 'rdf/microdata/format'
     require 'rdf/microdata/vocab'
-    autoload :Expansion,
-    autoload :Profile,
-    autoload :
-    autoload :
+    autoload :Expansion, 'rdf/microdata/expansion'
+    autoload :Profile, 'rdf/microdata/profile'
+    autoload :RdfaReader, 'rdf/microdata/rdfa_reader'
+    autoload :Reader, 'rdf/microdata/reader'
+    autoload :Registry, 'rdf/microdata/registry'
+    autoload :VERSION, 'rdf/microdata/version'
   end
 end
data/lib/rdf/microdata/expansion.rb
CHANGED
@@ -26,7 +26,6 @@ module RDF::Microdata
       repo = RDF::Repository.new
       repo << self # Add default graph
 
-      count = repo.count
       log_debug("expand") {"Loaded #{repo.size} triples into default graph"}
 
       repo = owl_entailment(repo)
@@ -38,7 +37,7 @@ module RDF::Microdata
     end
 
     def rule(name, &block)
-      Rule.new(name,
+      Rule.new(name, **@options, &block)
     end
 
     ##
@@ -72,7 +71,7 @@ module RDF::Microdata
       # r.execute(queryable) {|statement| puts statement.inspect}
       #
       # @param [String] name
-      def initialize(name, options
+      def initialize(name, **options, &block)
         @antecedents = []
         @consequents = []
         @options = options.dup
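The two `+` lines in the expansion.rb hunks switch the rule constructor to a keyword-splat signature (`**options`) and splat the stored options at the call site (`**@options`). The removed lines are truncated in this diff, so the previous signature is not shown. The following is only a sketch of the new calling convention, using a hypothetical stand-in class `SketchRule` rather than the gem's own `Rule`; how the real class uses its block is not visible here, so yielding `self` is an assumption.

```ruby
# SketchRule is a hypothetical stand-in; only the constructor signature
# (name, **options, &block) mirrors the "+" line in the hunk above.
class SketchRule
  attr_reader :name, :options, :notes

  def initialize(name, **options, &block)
    @name = name
    @options = options.dup   # same defensive copy as in the hunk
    @notes = []
    yield self if block_given?  # assumption: the real block handling may differ
  end

  def note(text)
    @notes << text
  end
end

stored = { validate: true }  # plays the role of @options at the call site

# The double splat turns the stored Hash into keyword arguments.
rule = SketchRule.new('demo', **stored) { |r| r.note('configured') }
puts rule.options.inspect
```

With a `**options` parameter, the options Hash has to be splatted explicitly at the call site, which is what `Rule.new(name, **@options, &block)` does.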
data/lib/rdf/microdata/format.rb
CHANGED
@@ -19,7 +19,7 @@ module RDF::Microdata
   # @example Obtaining serialization format MIME types
   # RDF::Format.content_types #=> {"text/html" => [RDF::Microdata::Format]}
   #
-  # @see
+  # @see https://www.w3.org/TR/rdf-testcases/#ntriples
   class Format < RDF::Format
     content_encoding 'utf-8'
 
@@ -41,5 +41,91 @@ module RDF::Microdata
     def self.detect(sample)
       !!sample.match(/<[^>]*(itemprop|itemtype|itemref|itemscope|itemid)[^>]*>/m)
     end
+
+    ##
+    # Hash of CLI commands appropriate for this format
+    # @return [Hash{Symbol => Hash}]
+    def self.cli_commands
+      {
+        "to-rdfa": {
+          description: "Transform HTML+Microdata into HTML+RDFa",
+          parse: false,
+          help: "to-rdfa files ...\nTransform HTML+Microdata into HTML+RDFa",
+          filter: {
+            format: :microdata
+          },
+          option_use: {output_format: :disabled},
+          lambda: ->(files, **options) do
+            out = options[:output] || $stdout
+            xsl = Nokogiri::XSLT(%(<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
+<xsl:param name="indent-increment" select="' '"/>
+<xsl:output method="html" doctype-system="about:legacy-compat"/>
+
+<xsl:template name="newline">
+<xsl:text disable-output-escaping="yes">
+</xsl:text>
+</xsl:template>
+
+<xsl:template match="comment() | processing-instruction()">
+<xsl:param name="indent" select="''"/>
+<xsl:call-template name="newline"/>
+<xsl:value-of select="$indent"/>
+<xsl:copy />
+</xsl:template>
+
+<xsl:template match="text()">
+<xsl:param name="indent" select="''"/>
+<xsl:call-template name="newline"/>
+<xsl:value-of select="$indent"/>
+<xsl:value-of select="normalize-space(.)"/>
+</xsl:template>
+
+<xsl:template match="text()[normalize-space(.)='']"/>
+
+<xsl:template match="*">
+<xsl:param name="indent" select="''"/>
+<xsl:call-template name="newline"/>
+<xsl:value-of select="$indent"/>
+<xsl:choose>
+<xsl:when test="count(child::*) > 0">
+<xsl:copy>
+<xsl:copy-of select="@*"/>
+<xsl:apply-templates select="*|text()">
+<xsl:with-param name="indent" select="concat ($indent, $indent-increment)"/>
+</xsl:apply-templates>
+<xsl:call-template name="newline"/>
+<xsl:value-of select="$indent"/>
+</xsl:copy>
+</xsl:when>
+<xsl:otherwise>
+<xsl:copy-of select="."/>
+</xsl:otherwise>
+</xsl:choose>
+</xsl:template>
+</xsl:stylesheet>).gsub(/^ /, ''))
+            if files.empty?
+              # If files are empty, either use options[::evaluate]
+              input = options[:evaluate] ? StringIO.new(options[:evaluate]) : STDIN
+              input.set_encoding(options.fetch(:encoding, Encoding::UTF_8))
+              RDF::Microdata::Reader.new(input, **options.merge(rdfa: true)) do |reader|
+                reader.rdfa.xpath("//text()").each do |txt|
+                  txt.content = txt.content.to_s.strip
+                end
+                out.puts xsl.apply_to(reader.rdfa).to_s
+              end
+            else
+              files.each do |file|
+                RDF::Microdata::Reader.open(file, **options.merge(rdfa: true)) do |reader|
+                  reader.rdfa.xpath("//text()").each do |txt|
+                    txt.content = txt.content.to_s.strip
+                  end
+                  out.puts xsl.apply_to(reader.rdfa).to_s
+                end
+              end
+            end
+          end
+        },
+      }
+    end
   end
 end
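The unchanged `self.detect` method in the format.rb hunk decides whether a sample looks like Microdata by searching for any of the five Microdata attribute names inside a tag. The sketch below copies that regular expression into a standalone constant so its behavior can be tried without loading the gem; the constant and helper names (`MICRODATA_ATTRIBUTE`, `looks_like_microdata?`) are hypothetical.

```ruby
# Same pattern as in Format.detect, copied verbatim from the hunk above.
MICRODATA_ATTRIBUTE = /<[^>]*(itemprop|itemtype|itemref|itemscope|itemid)[^>]*>/m

# Hypothetical helper name; returns true or false, like Format.detect.
def looks_like_microdata?(sample)
  !!sample.match(MICRODATA_ATTRIBUTE)
end

# A tag carrying Microdata attributes is detected.
puts looks_like_microdata?('<div itemscope itemtype="http://schema.org/Person"></div>')

# RDFa-only markup contains none of the five attribute names.
puts looks_like_microdata?('<div vocab="http://schema.org/" typeof="Person"></div>')

# The attribute name must appear between "<" and ">" to count.
puts looks_like_microdata?('itemprop mentioned outside of any tag')
```

Because the pattern accepts the attribute names anywhere between `<` and `>`, it behaves as a heuristic rather than a parser: a tag such as `<a href="/itemprop-guide">` also matches.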
data/lib/rdf/microdata/rdfa_reader.rb
ADDED
@@ -0,0 +1,121 @@
+require 'rdf/rdfa'
+require 'nokogumbo'
+
+module RDF::Microdata
+  ##
+  # Update DOM to turn Microdata into RDFa and parse using the RDFa Reader
+  class RdfaReader < RDF::RDFa::Reader
+    # The transformed DOM using RDFa
+    # @return [RDF::HTML::Document]
+    attr_reader :rdfa
+
+    def self.format(klass = nil)
+      if klass.nil?
+        RDF::Microdata::Format
+      else
+        super
+      end
+    end
+
+    ##
+    # Initializes the RdfaReader instance.
+    #
+    # @param [IO, File, String] input
+    # the input stream to read
+    # @param [Hash{Symbol => Object}] options
+    # any additional options (see `RDF::Reader#initialize`)
+    # @return [reader]
+    # @yield [reader] `self`
+    # @yieldparam [RDF::Reader] reader
+    # @yieldreturn [void] ignored
+    # @raise [RDF::ReaderError] if _validate_
+    def initialize(input = $stdin, **options, &block)
+      @options = options
+      log_debug('', "using RDFa transformation reader")
+
+      input = case input
+      when ::Nokogiri::XML::Document, ::Nokogiri::HTML::Document then input
+      else
+        # Try to detect charset from input
+        options[:encoding] ||= input.charset if input.respond_to?(:charset)
+
+        # Otherwise, default is utf-8
+        options[:encoding] ||= 'utf-8'
+        options[:encoding] = options[:encoding].to_s if options[:encoding]
+        input = input.read if input.respond_to?(:read)
+        ::Nokogiri::HTML5(input.force_encoding(options[:encoding]))
+      end
+
+      # For all members having @itemscope
+      input.css("[itemscope]").each do |item|
+        # Get @itemtypes to create @type and @vocab
+        item.attribute('itemscope').remove
+        if item['itemtype']
+          # Only absolute URLs
+          types = item.attribute('itemtype').
+            remove.
+            to_s.
+            split(/\s+/).
+            select {|t| RDF::URI(t).absolute?}
+
+          item['typeof'] = types.join(' ') unless types.empty?
+          if vocab = types.first
+            vocab = begin
+              type_vocab = vocab.to_s.sub(/([\/\#])[^\/\#]*$/, '\1')
+              Registry.new(type_vocab) if type_vocab
+            end
+            item['vocab'] = vocab.uri.to_s if vocab
+          end
+        end
+        item['typeof'] ||= ''
+
+        # Change each itemid attribute to an resource attribute with the same value
+        if item['itemid']
+          id = item.attribute('itemid').remove
+          item['resource'] = id
+        end
+      end
+
+      # Add @resource for all itemprop values of object based on a @data value
+      input.css("object[itemprop][data]").each do |item|
+        item['resource'] ||= item['data']
+      end
+
+      # Replace all @itemprop values with @property
+      input.css("[itemprop]").each {|item| item['property'] = item.attribute('itemprop').remove}
+
+      # Wrap all @itemref properties
+      input.css("[itemref]").each do |item|
+        item_vocab = item['vocab'] || item.ancestors.detect {|a| a.attribute('vocab')}
+        item_vocab = item_vocab.to_s if item_vocab
+
+        item.attribute('itemref').remove.to_s.split(/\s+/).each do |ref|
+          if referenced = input.css("##{ref}")
+            # Add @vocab to referenced using the closest ansestor having @vocab of item.
+            # If the element with id reference has no resource attribute, add a resource attribute whose value is a NUMBER SIGN U+0023 followed by reference to the element.
+            # If the element with id reference has no typeof attribute, add a typeof="rdfa:Pattern" attribute to the element.
+            referenced.wrap(%(<div vocab="#{item_vocab}" resource="##{ref}" typeof="rdfa:Pattern"))
+
+            # Add a link child element to the element that represents the item, with a rel="rdfa:copy" attribute and an href attribute whose value is a NUMBER SIGN U+0023 followed by reference
+            link = ::Nokogiri::XML::Node.new('link', input)
+            link['rel'] = 'rdfa:copy'
+            link['href'] = "##{ref}"
+            item << link
+          end
+        end
+      end
+
+      @rdfa = input
+      log_debug('', "Transformed document: #{input.to_html}")
+
+      options = options.merge(
+        library: :nokogiri,
+        reference_folding: true,
+        host_language: :html5,
+        version: :"rdfa1.1")
+
+      # Rely on RDFa reader
+      super(input, **options, &block)
+    end
+  end
+end
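In the new rdfa_reader.rb, each `@itemtype` value is split on whitespace, filtered down to absolute URLs, and the first remaining type is trimmed back to its last `/` or `#` to obtain a vocabulary prefix, which the reader then passes to `Registry.new`. The sketch below reproduces only the string handling, under stated assumptions: the helper names `absolute_types` and `vocab_prefix` are hypothetical, and the standard library's `URI.parse` stands in for `RDF::URI`, so edge cases may differ from the gem's behavior.

```ruby
require 'uri'

# Hypothetical helper: keep only the absolute URLs in an itemtype value.
# Assumption: URI.parse approximates RDF::URI(t).absolute? for common inputs.
def absolute_types(itemtype)
  itemtype.split(/\s+/).select do |t|
    begin
      URI.parse(t).absolute?
    rescue URI::InvalidURIError
      false
    end
  end
end

# Same substitution as in the reader: drop everything after the last
# "/" or "#" so that only the vocabulary prefix remains.
def vocab_prefix(type)
  type.sub(/([\/\#])[^\/\#]*$/, '\1')
end

types = absolute_types('http://schema.org/Person Person http://xmlns.com/foaf/0.1/Agent')
puts types.inspect
puts vocab_prefix(types.first)
puts vocab_prefix('http://www.w3.org/2002/07/owl#Class')
```

The relative token `Person` is dropped by the filter, and the remaining types join into the `typeof` value while the first one determines `vocab`.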