rdf-microdata 0.1.3 → 0.2.0

Files changed (4)
  1. data/README +15 -3
  2. data/VERSION +1 -1
  3. data/lib/rdf/microdata/reader.rb +12 -24
  4. metadata +5 -5
data/README CHANGED
@@ -19,10 +19,13 @@ Install with 'gem install rdf-microdata'
 
  graph = RDF::Graph.load("etc/foaf.html", :format => :microdata)
 
+ ## Note
+ The Microdata editor has recently [dropped support for RDF
+ conversion](http://html5.org/tools/web-apps-tracker?from=6426&to=6427), as a result, this gem is being used to
+ investigate ways in which Microdata might have more satisfactory RDF generation.
+
  ### Generating RDF friendly URIs from terms
- As defined, Microdata creates ugly (and even illegal) URIs for `@itemprop` entries that are a simple
- term, and not already a URI. {RDF::Microdata::Reader} implements a `:rdf\_terms` option which uses an alternative
- process for creating URIs from terms: If the `@itemprop` is included within an item having an `@itemtype`,
+ If the `@itemprop` is included within an item having an `@itemtype`,
  the URI of the `@itemtype` will be used for generating a term URI. The type URI will be trimmed following
  the last '#' or '/' character, and the term will be appended to the resulting URI. This is in keeping
  with standard convention for defining properties and classes within an RDFS or OWL vocabulary.
@@ -48,6 +51,15 @@ With the `:rdf\_terms` option, this becomes:
  @prefix schema: <http://schema.org/> .
  <> md:item [ a schema:Person; schema:name "Gregg" ] .
 
+ ### Improve xsd:date, xsd:time, xsd:dateTime and xsd:duration generation from _time_ element
+
+ Use the lexical form of the @datetime attribute of the _time_ element to determine the specific type
+ of the generated literal.
+
+ ### Remove implicit RDF triple generation
+
+ html>head>title and anchor (_a_) elements no longer generate triples without @item* properties
+
  ## Dependencies
  * [RDF.rb](http://rubygems.org/gems/rdf) (>= 0.3.3)
  * [Nokogiri](http://rubygems.org/gems/nokogiri) (>= 1.3.3)
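
Reviewer note: the README text in this hunk describes trimming the `@itemtype` URI after its last '#' or '/' and appending the term. The rule can be tried in isolation with a minimal sketch such as the one below. The helper name `term_uri` is hypothetical and is not part of the gem; the sketch works on plain strings rather than `RDF::URI` objects, and it mirrors the regular expression that appears in the reader.rb hunk of this diff.

```ruby
# Hypothetical helper (not part of rdf-microdata): builds a property URI
# for an @itemprop term from the URI of the enclosing @itemtype.
def term_uri(type_uri, term)
  # Keep everything up to and including the last '/' or '#',
  # then replace the trailing segment with the term.
  type_uri.to_s.sub(/([\/#])[^\/#]*\z/) { "#{$1}#{term}" }
end

puts term_uri("http://schema.org/Person", "name")
puts term_uri("http://example.org/vocab#Thing", "label")
```

A type URI that contains neither '/' nor '#' is returned unchanged by this sketch; how the gem itself treats that case is not shown in the diff.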
data/VERSION CHANGED
@@ -1 +1 @@
- 0.1.3
+ 0.2.0
data/lib/rdf/microdata/reader.rb CHANGED
@@ -4,9 +4,12 @@ module RDF::Microdata
  ##
  # An Microdata parser in Ruby
  #
- # Based on processing rules described here:
- # @see http://dev.w3.org/html5/md/
+ # Based on processing rules, amended with the following:
+ # * property generation from tokens now uses the associated @itemtype as the basis for generation
+ # * implicit triples are not generated, only those with @item*
+ # * @datetime values are scanned lexically to find appropriate datatype
  #
+ # @see http://dev.w3.org/html5/md/
  # @author [Gregg Kellogg](http://kellogg-assoc.com/)
  class Reader < RDF::Reader
    format Format
@@ -39,8 +42,6 @@ module RDF::Microdata
  # whether to intern all parsed URIs
  # @option options [#to_s] :base_uri (nil)
  # the base URI to use when resolving relative URIs
- # @option options [Boolean] :rdf_terms (false)
- # Generate URIs for itemprop terms based on namespace of itemtype
  # @option options [Array] :debug
  # Array to place debug messages
  # @return [reader]
@@ -163,19 +164,6 @@ module RDF::Microdata
    base = RDF::URI("")
  end
 
- ##
- # 1. If the title element is not null, then generate the following triple:
- #
- # subject: the document's current address
- # predicate: http://purl.org/dc/terms/title
- # object: the concatenation of the data of all the child text nodes of the title element,
- # in tree order, as a plain literal, with the language information set from
- # the language of the title element, if it is not unknown.
- doc.css('html>head>title').each do |title|
-   lang = title.attribute('language')
-   add_triple(title, base, RDF::DC.title, title.inner_text)
- end
-
  # 2. For each a, area, and link element in the Document, run these substeps:
  #
  # * If the element does not have a rel attribute, then skip this element.
@@ -338,15 +326,10 @@ module RDF::Microdata
 
  predicate = if name_uri.absolute?
    name_uri
- elsif @options[:rdf_terms]
+ else
    # Use the URI of the type to create URIs for @itemprop terms
    add_debug(element, "gentrips: rdf_type=#{rdf_type}")
    predicate = RDF::URI(rdf_type.to_s.sub(/([\/\#])[^\/\#]*$/, '\1' + name))
- elsif !name.include?(':')
-   s = type.to_s
-   s += '%20' unless s[-1,1] == ':'
-   s += name
-   RDF::MD[s.gsub('#', '%23')]
  end
  add_debug(element, "gentrips(6.1.5): predicate=#{predicate}")
 
@@ -478,7 +461,12 @@ module RDF::Microdata
  when %w(object).include?(element.name)
    uri(element.attribute('data'), element.base)
  when %w(time).include?(element.name) && element.has_attribute?('datetime')
-   RDF::Literal::DateTime.new(element.attribute('datetime'))
+   # Lexically scan value and assign appropriate type, otherwise, leave untyped
+   v = element.attribute('datetime').to_s
+   datatype = %w(Date Time DateTime).map {|t| RDF::Literal.const_get(t)}.detect do |dt|
+     v.match(dt::GRAMMAR)
+   end || RDF::Literal
+   datatype.new(v)
  else
    RDF::Literal.new(element.text, :language => element.language)
  end
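
Reviewer note: the hunk above replaces the unconditional `RDF::Literal::DateTime` with a lexical scan that tries Date, Time, and DateTime in turn and falls back to a plain literal. The README entry also mentions xsd:duration, but no duration check appears in this hunk. The selection logic can be sketched without the gem as follows. The patterns in `GRAMMARS` are simplified stand-ins written for this note and are an assumption, not the actual `GRAMMAR` constants from RDF.rb; `datatype_for` is likewise a hypothetical name.

```ruby
# Simplified, assumed patterns standing in for the GRAMMAR constants of
# RDF::Literal::Date, ::Time and ::DateTime (the real ones may differ).
GRAMMARS = {
  date:      /\A-?\d{4}-\d{2}-\d{2}(Z|[+-]\d{2}:\d{2})?\z/,
  time:      /\A\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?\z/,
  date_time: /\A-?\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?\z/
}.freeze

# Hypothetical helper: returns the name of the first pattern that matches
# the @datetime value, or :plain when none does (the untyped fallback).
def datatype_for(value)
  found = GRAMMARS.detect { |_name, grammar| value.to_s.match(grammar) }
  found ? found.first : :plain
end

%w(2011-08-12 12:30:00 2011-08-12T12:30:00Z yesterday).each do |v|
  puts "#{v} => #{datatype_for(v)}"
end
```

Because each stand-in pattern is anchored at both ends, at most one of them can match a given value, so the order of the checks does not change the result in this sketch.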
metadata CHANGED
@@ -1,13 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: rdf-microdata
  version: !ruby/object:Gem::Version
-   hash: 29
+   hash: 23
    prerelease:
    segments:
    - 0
-   - 1
-   - 3
-   version: 0.1.3
+   - 2
+   - 0
+   version: 0.2.0
  platform: ruby
  authors:
  - Gregg Kellogg
@@ -15,7 +15,7 @@ autorequire:
  bindir: bin
  cert_chain: []
 
- date: 2011-07-26 00:00:00 -07:00
+ date: 2011-08-12 00:00:00 -07:00
  default_executable:
  dependencies:
  - !ruby/object:Gem::Dependency