rdf-microdata 2.2.1 → 3.1.2
- checksums.yaml +5 -5
- data/README.md +32 -21
- data/UNLICENSE +1 -1
- data/VERSION +1 -1
- data/etc/doap.html +9 -9
- data/etc/doap.nt +19 -19
- data/etc/doap.ttl +20 -21
- data/lib/rdf/microdata.rb +10 -7
- data/lib/rdf/microdata/expansion.rb +2 -3
- data/lib/rdf/microdata/format.rb +87 -1
- data/lib/rdf/microdata/rdfa_reader.rb +121 -0
- data/lib/rdf/microdata/reader.rb +72 -159
- data/lib/rdf/microdata/reader/nokogiri.rb +13 -5
- data/lib/rdf/microdata/registry.rb +109 -0
- metadata +57 -30
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
-
-  metadata.gz:
-  data.tar.gz:
+SHA256:
+  metadata.gz: e56e041f8debb740ed69b8f1aa635b9cf1df26a258dc23f68b05a31cc57586e5
+  data.tar.gz: 63d45db46e96c9544065e4727560693e83d4062d48b9380651d03285954169a4
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 95d8bbbc8ce3bb6fa5ca05366a544424db0add21092a7e3cf474896d78a8401f62d19e4a31ab489712f2ff7cad68581c342623e413444f552815112b413c301b
+  data.tar.gz: 18e36f99b275bf1178a6faa5fc6e7c585e29b67437db36b40a9b5e9aff45368a358a72213be10712d917d8824fe511842d1ca6bc5bace223a5c4aef28d529e04
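The new `checksums.yaml` records SHA256 and SHA512 digests for the two archives packed inside the gem. As a rough sketch of how such entries could be recomputed for comparison, the helper below hashes a byte string with Ruby's standard `digest` library. The name `archive_checksums` is hypothetical, and reading `metadata.gz` and `data.tar.gz` out of an unpacked `.gem` file is assumed rather than shown.

```ruby
require 'digest'

# Hypothetical helper (not part of RubyGems or rdf-microdata): returns the
# hex digests for one archive's raw bytes, keyed like checksums.yaml.
def archive_checksums(bytes)
  {
    'SHA256' => Digest::SHA256.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end

sums = archive_checksums('example archive bytes')
puts sums['SHA256'].length # 64 hex characters
puts sums['SHA512'].length # 128 hex characters
```

Comparing these strings with the values listed for `metadata.gz` and `data.tar.gz` would show whether a downloaded archive matches the published one.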
data/README.md
CHANGED
@@ -2,8 +2,10 @@
 
 [Microdata][] parser for RDF.rb.
 
-[![Gem Version](https://badge.fury.io/rb/rdf-microdata.png)](
-[![Build Status](https://
+[![Gem Version](https://badge.fury.io/rb/rdf-microdata.png)](https://badge.fury.io/rb/rdf-microdata)
+[![Build Status](https://github.com/ruby-rdf/rdf-microdata/workflows/CI/badge.svg?branch=develop)](https://github.com/ruby-rdf/rdf-microdata/actions?query=workflow%3ACI)
+[![Coverage Status](https://coveralls.io/repos/ruby-rdf/rdf-microdata/badge.svg?branch=develop)](https://coveralls.io/github/ruby-rdf/rdf-microdata?branch=develop)
+[![Gitter chat](https://badges.gitter.im/ruby-rdf/rdf.png)](https://gitter.im/ruby-rdf/rdf)
 
 ## DESCRIPTION
 RDF::Microdata is a Microdata reader for Ruby using the [RDF.rb][RDF.rb] library suite.
@@ -45,11 +47,12 @@ GRDDL-type triple generation, such as for html>head>title anchor tags.
 If the `RDFa` parser is available, {RDF::Microdata::Format} will not assert content type `text/html` or file extension `.html`, as this is also asserted by RDFa. Instead, the RDFa reader will invoke the microdata reader if an `@itemscope` attribute is detected.
 
 ## Dependencies
-* [RDF.rb](
-* [RDF::
-* [
-* [
-*
+* [RDF.rb](https://rubygems.org/gems/rdf) (~> 3.1)
+* [RDF::RDFa](https://rubygems.org/gems/rdf-xsd) (~> 3.1)
+* [RDF::XSD](https://rubygems.org/gems/rdf-xsd) (~> 3.1)
+* [HTMLEntities](https://rubygems.org/gems/htmlentities) ('~> 4.3')
+* [Nokogiri](https://rubygems.org/gems/nokogiri) (~> 1.10)
+* Soft dependency on [Nokogumbo](https://github.com/rubys/nokogumbo) (~> 2.0)
 
 ## Documentation
 Full documentation available on [Rubydoc.info][Microdata doc]
@@ -60,17 +63,23 @@ Full documentation available on [Rubydoc.info][Microdata doc]
 * {RDF::Microdata::Reader}
 * {RDF::Microdata::Reader::Nokogiri}
 
-
+
+### RDFa-based Reader
+There is an experimental reader based on transforming Microdata to RDFa within the DOM. To invoke
+this, add the `rdfa: true` option to the {RDF::Microdata::Reader.new}, or
+use {RDF::Microdata::RdfaReader} directly.
+
+The reader exposes a `#rdfa` method, which can be used to retrieve the transformed HTML+RDFa
 
 ## Resources
 * [RDF.rb][RDF.rb]
-* [Documentation](
+* [Documentation](https://www.rubydoc.info/github/ruby-rdf/rdf-microdata/)
 * [History](file:History.md)
 * [Microdata][]
 * [Microdata RDF][]
 
 ## Author
-* [Gregg Kellogg](
+* [Gregg Kellogg](https://github.com/gkellogg) - <https://greggkellogg.net/>
 
 ## Contributing
 
@@ -84,25 +93,27 @@ Full documentation available on [Rubydoc.info][Microdata doc]
 list in the the `README`. Alphabetical order applies.
 * Do note that in order for us to merge any non-trivial changes (as a rule
 of thumb, additions larger than about 15 lines of code), we need an
-explicit [public domain dedication][PDD] on record from you
+explicit [public domain dedication][PDD] on record from you,
+which you will be asked to agree to on the first commit to a repo within the organization.
+Note that the agreement applies to all repos in the [Ruby RDF](https://github.com/ruby-rdf/) organization.
 
 ## License
 
 This is free and unencumbered public domain software. For more information,
-see <
+see <https://unlicense.org/> or the accompanying {file:UNLICENSE} file.
 
 ## FEEDBACK
 
 * gregg@greggkellogg.net
-* <
-* <
-* <
+* <https://rubygems.org/rdf-microdata>
+* <https://github.com/ruby-rdf/rdf-microdata>
+* <https://lists.w3.org/Archives/Public/public-rdf-ruby/>
 
 [RDF.rb]: https://github.com/ruby-rdf/rdf
-[YARD]:
-[YARD-GS]:
-[PDD]:
-[Microdata]:
-[Microdata RDF]:
-[Microdata doc]:
+[YARD]: https://yardoc.org/
+[YARD-GS]: https://rubydoc.info/docs/yard/file/docs/GettingStarted.md
+[PDD]: https://unlicense.org/#unlicensing-contributions
+[Microdata]: https://dev.w3.org/html5/md/Overview.html "HTML Microdata"
+[Microdata RDF]: https://dvcs.w3.org/hg/htmldata/raw-file/default/microdata-rdf/index.html "Microdata to RDF"
+[Microdata doc]: https://rubydoc.info/github/ruby-rdf/rdf-microdata/frames
 [Nokogumbo]: https://github.com/rubys/nokogumbo/#readme
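The README's new "RDFa-based Reader" section names the `rdfa: true` option and the `#rdfa` accessor. A minimal sketch of how the two might be combined follows, assuming the rdf-microdata gem and its RDFa and Nokogiri dependencies are installed. `SAMPLE_HTML` and `microdata_to_rdfa` are illustrative names, not part of the gem, and the sample markup is made up.

```ruby
# Sample HTML+Microdata input (invented for this illustration).
SAMPLE_HTML = <<~HTML
  <div itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Gregg Kellogg</span>
  </div>
HTML

# Hypothetical helper: returns the transformed HTML+RDFa as a String.
# The gem is required lazily, so defining the method needs no gems;
# calling it does.
def microdata_to_rdfa(html)
  require 'rdf/microdata'
  reader = RDF::Microdata::Reader.new(html, rdfa: true)
  # Per the README, #rdfa exposes the transformed document.
  reader.rdfa.to_html
end

# With the gem installed, this would print the rewritten markup:
# puts microdata_to_rdfa(SAMPLE_HTML)
```

Because the README describes this reader as experimental, the exact shape of the returned markup should not be relied on.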
data/UNLICENSE
CHANGED
@@ -21,4 +21,4 @@ OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
 ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
 OTHER DEALINGS IN THE SOFTWARE.
 
-For more information, please refer to <
+For more information, please refer to <https://unlicense.org/1.0/>
data/VERSION
CHANGED
@@ -1 +1 @@
-
+3.1.2
data/etc/doap.html
CHANGED
@@ -3,7 +3,7 @@
 <head>
 <title lang="en" itemprop="shortdesc">Microdata reader for Ruby.</title>
 <meta http-equiv="Content-Type" content="text/html;charset=utf-8" >
-<base href="
+<base href="https://rubygems.org/gems/rdf-microdata">
 </head>
 <body>
 <p>Project description for <span itemprop="name">RDF::Microdata</span>.</p>
@@ -18,26 +18,26 @@
 <dt>Created</dt><dd><time itemprop="created" datetime="2011-08-29">2011-08-29</time></dd>
 <dt>Blog</dt><dd><a href="http://greggkellogg.net/" itemprop="blog">http://greggkellogg.net/</a></dd>
 <dt>Bug DB</dt><dd>
-<a href="
-
+<a href="https://github.com/ruby-rdf/rdf-microdata/issues" itemprop="bug-database">
+https://github.com/ruby-rdf/rdf-microdata/issues
 </a>
 </dd>
 <dt>Category</dt><dd>
 <a itemprop="category" href="http://dbpedia.org/resource/Resource_Description_Framework">Resource Description Framework</a>
 for
-<
+<span itemprop="programming-language">Ruby</span>
 </dd>
 <dt>Implements</dt><dd>
 <a itemprop="implements" href="http://www.w3.org/TR/microdata-rdf/">Microdata to RDF</a>
 </dd>
-<dt>Download</dt><dd><a href="
-
+<dt>Download</dt><dd><a href="https://rubygems.org/gems/rdf-microdata" itemprop="download-page">
+https://rubygems.org/gems/rdf-microdata
 </a></dd>
-<dt>Home Page</dt><dd><a href="
-
+<dt>Home Page</dt><dd><a href="https://github.com/ruby-rdf/rdf-microdata" itemprop="homepage">
+https://github.com/ruby-rdf/rdf-microdata
 </a></dd>
 <dt>License</dt><dd>
-<a href="
+<a href="https://unlicense.org/1.0/" itemprop="license">Public Domain</a>
 </dd>
 <dt>Mailing List</dt><dd><a href="http://lists.w3.org/Archives/Public/public-rdf-ruby/" itemprop="mailing-list">
 http://lists.w3.org/Archives/Public/public-rdf-ruby/
data/etc/doap.nt
CHANGED
@@ -1,19 +1,19 @@
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
-<
+<https://rubygems.org/gems/rdf-microdata> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://usefulinc.com/ns/doap#Project> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#shortdesc> "Microdata reader for Ruby."@en .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#description> "\n RDF::Microdata is an Microdata reader for Ruby using the RDF.rb library suite.\n "@en .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#name> "RDF::Microdata" .
+<https://rubygems.org/gems/rdf-microdata> <http://purl.org/dc/terms/creator> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#developer> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#documenter> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#maintainer> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://xmlns.com/foaf/0.1/creator> <http://greggkellogg.net/foaf#me> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#created> "2011-08-29"^^<http://www.w3.org/2001/XMLSchema#date> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#blog> <http://greggkellogg.net/> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#bug-database> <https://github.com/ruby-rdf/rdf-microdata/issues> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#category> <http://dbpedia.org/resource/Resource_Description_Framework> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#programming-language> "Ruby" .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#implements> <http://www.w3.org/TR/microdata-rdf/> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#download-page> <https://rubygems.org/gems/rdf-microdata> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#homepage> <https://github.com/ruby-rdf/rdf-microdata> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#license> <https://unlicense.org/1.0/> .
+<https://rubygems.org/gems/rdf-microdata> <http://usefulinc.com/ns/doap#mailing-list> <http://lists.w3.org/Archives/Public/public-rdf-ruby/> .
data/etc/doap.ttl
CHANGED
@@ -1,27 +1,26 @@
-@
+@base <https://rubygems.org/gems/rdf-microdata> .
+@prefix dc: <http://purl.org/dc/terms/> .
 @prefix doap: <http://usefulinc.com/ns/doap#> .
 @prefix foaf: <http://xmlns.com/foaf/0.1/> .
-@prefix rdf:
-@prefix xsd:
+@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
+@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
 
-
-
-  doap:
-  doap:
+<> a doap:Project;
+  doap:name "RDF::Microdata";
+  doap:shortdesc "Microdata reader for Ruby RDF.rb."@en;
+  doap:description "RDF::Microdata is an Microdata reader for Ruby using the RDF.rb library suite."@en;
+  dc:creator <https://greggkellogg.net/foaf#me>;
+  doap:blog <https://greggkellogg.net/>;
+  doap:bug-database <https://github.com/ruby-rdf/rdf-microdata/issues>;
   doap:category <http://dbpedia.org/resource/Resource_Description_Framework>;
   doap:created "2011-08-29"^^xsd:date;
-  doap:
-
-
-  doap:
-  doap:documenter <http://greggkellogg.net/foaf#me>;
-  doap:download-page <http://rubygems.org/gems/rdf-microdata>;
-  doap:homepage <http://github.com/ruby-rdf/rdf-microdata>;
+  doap:developer <https://greggkellogg.net/foaf#me>;
+  doap:documenter <https://greggkellogg.net/foaf#me>;
+  doap:download-page <>;
+  doap:homepage <https://github.com/ruby-rdf/rdf-microdata>;
   doap:implements <http://www.w3.org/TR/microdata-rdf/>;
-  doap:license <
-  doap:mailing-list <
-  doap:maintainer <
-  doap:
-
-  doap:shortdesc "Microdata reader for Ruby."@en;
-  foaf:creator <http://greggkellogg.net/foaf#me> .
+  doap:license <https://unlicense.org/1.0/>;
+  doap:mailing-list <https://lists.w3.org/Archives/Public/public-rdf-ruby/>;
+  doap:maintainer <https://greggkellogg.net/foaf#me>;
+  doap:programming-language "Ruby";
+  foaf:creator <https://greggkellogg.net/foaf#me> .
data/lib/rdf/microdata.rb
CHANGED
@@ -15,18 +15,21 @@ module RDF
   # end
   # end
   #
-  # @see
-  # @see
+  # @see https://www.rubydoc.info/github/ruby-rdf/rdf/
+  # @see https://www.w3.org/TR/2011/WD-microdata-20110525/
   #
-  # @author [Gregg Kellogg](
+  # @author [Gregg Kellogg](https://greggkellogg.net/)
   module Microdata
     USES_VOCAB = RDF::URI("http://www.w3.org/ns/rdfa#usesVocabulary")
+    DEFAULT_REGISTRY = File.expand_path("../../../etc/registry.json", __FILE__)
 
     require 'rdf/microdata/format'
     require 'rdf/microdata/vocab'
-    autoload :Expansion,
-    autoload :Profile,
-    autoload :
-    autoload :
+    autoload :Expansion, 'rdf/microdata/expansion'
+    autoload :Profile, 'rdf/microdata/profile'
+    autoload :RdfaReader, 'rdf/microdata/rdfa_reader'
+    autoload :Reader, 'rdf/microdata/reader'
+    autoload :Registry, 'rdf/microdata/registry'
+    autoload :VERSION, 'rdf/microdata/version'
   end
 end
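The added `DEFAULT_REGISTRY` constant locates `etc/registry.json` relative to the source file. The sketch below shows how `File.expand_path` resolves that relative path; the `/gems/rdf-microdata` install prefix is a made-up stand-in for wherever `__FILE__` actually points.

```ruby
# Stand-in for __FILE__ inside lib/rdf/microdata.rb; the /gems/... prefix
# is a hypothetical install location.
source_file = '/gems/rdf-microdata/lib/rdf/microdata.rb'

# File.expand_path treats its second argument as the starting directory,
# so the first ".." steps off the file name itself, the second leaves
# lib/rdf, and the third leaves lib.
registry = File.expand_path('../../../etc/registry.json', source_file)
puts registry
```

Three `..` segments are needed because the file name counts as the first path component to climb out of.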
data/lib/rdf/microdata/expansion.rb
CHANGED
@@ -26,7 +26,6 @@ module RDF::Microdata
       repo = RDF::Repository.new
       repo << self # Add default graph
 
-      count = repo.count
       log_debug("expand") {"Loaded #{repo.size} triples into default graph"}
 
       repo = owl_entailment(repo)
@@ -38,7 +37,7 @@ module RDF::Microdata
     end
 
     def rule(name, &block)
-      Rule.new(name,
+      Rule.new(name, **@options, &block)
     end
 
     ##
@@ -72,7 +71,7 @@ module RDF::Microdata
       # r.execute(queryable) {|statement| puts statement.inspect}
       #
       # @param [String] name
-      def initialize(name, options
+      def initialize(name, **options, &block)
         @antecedents = []
         @consequents = []
         @options = options.dup
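The two signature changes above appear to move `Rule` to keyword arguments: the caller splats the stored options with `**@options`, and the initializer collects them with `**options`. A minimal sketch of that calling convention follows; `SketchRule` and the option names are hypothetical stand-ins, not the gem's own class or options.

```ruby
# Hypothetical stand-in for the Rule class in expansion.rb.
class SketchRule
  attr_reader :name, :options

  # Options arrive as keyword arguments, mirroring the new signature.
  def initialize(name, **options, &block)
    @name = name
    @options = options.dup
    @block = block # kept only to mirror the signature; unused here
  end
end

stored = { logger: nil, validate: true } # stands in for @options

# The double splat turns the stored Hash back into keyword arguments.
rule = SketchRule.new('prp-spo1', **stored)
p rule.options
```

Under Ruby 3, passing `stored` positionally (without `**`) to this initializer would raise an `ArgumentError`, which is one plausible motivation for the explicit splat.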
data/lib/rdf/microdata/format.rb
CHANGED
@@ -19,7 +19,7 @@ module RDF::Microdata
   # @example Obtaining serialization format MIME types
   # RDF::Format.content_types #=> {"text/html" => [RDF::Microdata::Format]}
   #
-  # @see
+  # @see https://www.w3.org/TR/rdf-testcases/#ntriples
   class Format < RDF::Format
     content_encoding 'utf-8'
 
@@ -41,5 +41,91 @@ module RDF::Microdata
     def self.detect(sample)
       !!sample.match(/<[^>]*(itemprop|itemtype|itemref|itemscope|itemid)[^>]*>/m)
     end
+
+    ##
+    # Hash of CLI commands appropriate for this format
+    # @return [Hash{Symbol => Hash}]
+    def self.cli_commands
+      {
+        "to-rdfa": {
+          description: "Transform HTML+Microdata into HTML+RDFa",
+          parse: false,
+          help: "to-rdfa files ...\nTransform HTML+Microdata into HTML+RDFa",
+          filter: {
+            format: :microdata
+          },
+          option_use: {output_format: :disabled},
+          lambda: ->(files, **options) do
+            out = options[:output] || $stdout
+            xsl = Nokogiri::XSLT(%(<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
+              <xsl:param name="indent-increment" select="' '"/>
+              <xsl:output method="html" doctype-system="about:legacy-compat"/>
+
+              <xsl:template name="newline">
+                <xsl:text disable-output-escaping="yes">
+              </xsl:text>
+              </xsl:template>
+
+              <xsl:template match="comment() | processing-instruction()">
+                <xsl:param name="indent" select="''"/>
+                <xsl:call-template name="newline"/>
+                <xsl:value-of select="$indent"/>
+                <xsl:copy />
+              </xsl:template>
+
+              <xsl:template match="text()">
+                <xsl:param name="indent" select="''"/>
+                <xsl:call-template name="newline"/>
+                <xsl:value-of select="$indent"/>
+                <xsl:value-of select="normalize-space(.)"/>
+              </xsl:template>
+
+              <xsl:template match="text()[normalize-space(.)='']"/>
+
+              <xsl:template match="*">
+                <xsl:param name="indent" select="''"/>
+                <xsl:call-template name="newline"/>
+                <xsl:value-of select="$indent"/>
+                <xsl:choose>
+                  <xsl:when test="count(child::*) > 0">
+                    <xsl:copy>
+                      <xsl:copy-of select="@*"/>
+                      <xsl:apply-templates select="*|text()">
+                        <xsl:with-param name="indent" select="concat ($indent, $indent-increment)"/>
+                      </xsl:apply-templates>
+                      <xsl:call-template name="newline"/>
+                      <xsl:value-of select="$indent"/>
+                    </xsl:copy>
+                  </xsl:when>
+                  <xsl:otherwise>
+                    <xsl:copy-of select="."/>
+                  </xsl:otherwise>
+                </xsl:choose>
+              </xsl:template>
+            </xsl:stylesheet>).gsub(/^ /, ''))
+            if files.empty?
+              # If files are empty, either use options[::evaluate]
+              input = options[:evaluate] ? StringIO.new(options[:evaluate]) : STDIN
+              input.set_encoding(options.fetch(:encoding, Encoding::UTF_8))
+              RDF::Microdata::Reader.new(input, **options.merge(rdfa: true)) do |reader|
+                reader.rdfa.xpath("//text()").each do |txt|
+                  txt.content = txt.content.to_s.strip
+                end
+                out.puts xsl.apply_to(reader.rdfa).to_s
+              end
+            else
+              files.each do |file|
+                RDF::Microdata::Reader.open(file, **options.merge(rdfa: true)) do |reader|
+                  reader.rdfa.xpath("//text()").each do |txt|
+                    txt.content = txt.content.to_s.strip
+                  end
+                  out.puts xsl.apply_to(reader.rdfa).to_s
+                end
+              end
+            end
+          end
+        },
+      }
+    end
   end
 end
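`Format.detect`, shown as context in the hunk above, decides whether a sample looks like Microdata by searching for any of the `item*` attribute names inside a tag. The standalone sketch below copies that pattern so its behavior can be tried without the gem; `MICRODATA_ATTR` and `microdata?` are illustrative names, and the sample strings are invented.

```ruby
# Same pattern as Format.detect in the hunk above.
MICRODATA_ATTR = /<[^>]*(itemprop|itemtype|itemref|itemscope|itemid)[^>]*>/m

# Returns true when the sample contains a tag mentioning a Microdata
# attribute name anywhere between "<" and ">".
def microdata?(sample)
  !!sample.match(MICRODATA_ATTR)
end

puts microdata?('<div itemscope itemtype="http://schema.org/Person">') # true
puts microdata?('<div vocab="http://schema.org/" typeof="Person">')    # false
puts microdata?('<a href="/itemid-list">')                             # true
```

The third call shows that the check is a heuristic: it matches the attribute names anywhere inside a tag, including inside attribute values.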
data/lib/rdf/microdata/rdfa_reader.rb
ADDED
@@ -0,0 +1,121 @@
+require 'rdf/rdfa'
+require 'nokogumbo'
+
+module RDF::Microdata
+  ##
+  # Update DOM to turn Microdata into RDFa and parse using the RDFa Reader
+  class RdfaReader < RDF::RDFa::Reader
+    # The transformed DOM using RDFa
+    # @return [RDF::HTML::Document]
+    attr_reader :rdfa
+
+    def self.format(klass = nil)
+      if klass.nil?
+        RDF::Microdata::Format
+      else
+        super
+      end
+    end
+
+    ##
+    # Initializes the RdfaReader instance.
+    #
+    # @param [IO, File, String] input
+    # the input stream to read
+    # @param [Hash{Symbol => Object}] options
+    # any additional options (see `RDF::Reader#initialize`)
+    # @return [reader]
+    # @yield [reader] `self`
+    # @yieldparam [RDF::Reader] reader
+    # @yieldreturn [void] ignored
+    # @raise [RDF::ReaderError] if _validate_
+    def initialize(input = $stdin, **options, &block)
+      @options = options
+      log_debug('', "using RDFa transformation reader")
+
+      input = case input
+      when ::Nokogiri::XML::Document, ::Nokogiri::HTML::Document then input
+      else
+        # Try to detect charset from input
+        options[:encoding] ||= input.charset if input.respond_to?(:charset)
+
+        # Otherwise, default is utf-8
+        options[:encoding] ||= 'utf-8'
+        options[:encoding] = options[:encoding].to_s if options[:encoding]
+        input = input.read if input.respond_to?(:read)
+        ::Nokogiri::HTML5(input.force_encoding(options[:encoding]))
+      end
+
+      # For all members having @itemscope
+      input.css("[itemscope]").each do |item|
+        # Get @itemtypes to create @type and @vocab
+        item.attribute('itemscope').remove
+        if item['itemtype']
+          # Only absolute URLs
+          types = item.attribute('itemtype').
+            remove.
+            to_s.
+            split(/\s+/).
+            select {|t| RDF::URI(t).absolute?}
+
+          item['typeof'] = types.join(' ') unless types.empty?
+          if vocab = types.first
+            vocab = begin
+              type_vocab = vocab.to_s.sub(/([\/\#])[^\/\#]*$/, '\1')
+              Registry.new(type_vocab) if type_vocab
+            end
+            item['vocab'] = vocab.uri.to_s if vocab
+          end
+        end
+        item['typeof'] ||= ''
+
+        # Change each itemid attribute to an resource attribute with the same value
+        if item['itemid']
+          id = item.attribute('itemid').remove
+          item['resource'] = id
+        end
+      end
+
+      # Add @resource for all itemprop values of object based on a @data value
+      input.css("object[itemprop][data]").each do |item|
+        item['resource'] ||= item['data']
+      end
+
+      # Replace all @itemprop values with @property
+      input.css("[itemprop]").each {|item| item['property'] = item.attribute('itemprop').remove}
+
+      # Wrap all @itemref properties
+      input.css("[itemref]").each do |item|
+        item_vocab = item['vocab'] || item.ancestors.detect {|a| a.attribute('vocab')}
+        item_vocab = item_vocab.to_s if item_vocab
+
+        item.attribute('itemref').remove.to_s.split(/\s+/).each do |ref|
+          if referenced = input.css("##{ref}")
+            # Add @vocab to referenced using the closest ansestor having @vocab of item.
+            # If the element with id reference has no resource attribute, add a resource attribute whose value is a NUMBER SIGN U+0023 followed by reference to the element.
+            # If the element with id reference has no typeof attribute, add a typeof="rdfa:Pattern" attribute to the element.
+            referenced.wrap(%(<div vocab="#{item_vocab}" resource="##{ref}" typeof="rdfa:Pattern"))
+
+            # Add a link child element to the element that represents the item, with a rel="rdfa:copy" attribute and an href attribute whose value is a NUMBER SIGN U+0023 followed by reference
+            link = ::Nokogiri::XML::Node.new('link', input)
+            link['rel'] = 'rdfa:copy'
+            link['href'] = "##{ref}"
+            item << link
+          end
+        end
+      end
+
+      @rdfa = input
+      log_debug('', "Transformed document: #{input.to_html}")
+
+      options = options.merge(
+        library: :nokogiri,
+        reference_folding: true,
+        host_language: :html5,
+        version: :"rdfa1.1")
+
+      # Rely on RDFa reader
+      super(input, **options, &block)
+    end
+  end
+end
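The `RdfaReader#initialize` body above rewrites Microdata attributes into RDFa ones: `itemscope` and `itemtype` become `typeof` and `vocab`, `itemid` becomes `resource`, and `itemprop` becomes `property`. The sketch below restates that per-element mapping on a plain Hash so it can be run without Nokogiri. It is a simplification under stated assumptions: `microdata_attrs_to_rdfa` and `absolute_iri?` are hypothetical helpers, the vocabulary is taken directly from the first type instead of going through `Registry`, the standard `uri` library stands in for `RDF::URI`, and `itemref` wrapping is left out.

```ruby
require 'uri'

# Hypothetical helper: true when the string parses as an IRI with a scheme.
def absolute_iri?(str)
  URI.parse(str).absolute?
rescue URI::InvalidURIError
  false
end

# Simplified sketch of the per-element attribute rewrite. The real reader
# edits a Nokogiri DOM in place and resolves the vocabulary via Registry.
def microdata_attrs_to_rdfa(attrs)
  out = attrs.dup

  if out.key?('itemscope')
    out.delete('itemscope')
    # Keep only absolute type IRIs, as the reader does.
    types = out.delete('itemtype').to_s.split(/\s+/).select { |t| absolute_iri?(t) }
    out['typeof'] = types.join(' ')
    # Same substitution as the source: drop the last path segment or fragment.
    out['vocab'] = types.first.sub(%r{([/#])[^/#]*$}, '\1') unless types.empty?
    out['resource'] = out.delete('itemid') if out.key?('itemid')
  end

  out['property'] = out.delete('itemprop') if out.key?('itemprop')
  out
end

person = microdata_attrs_to_rdfa(
  'itemscope' => '', 'itemtype' => 'http://schema.org/Person', 'itemid' => '#me'
)
p person
```

An element with `itemscope` but no usable type still receives an empty `typeof`, matching the `item['typeof'] ||= ''` line in the source.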