mida 0.1.3 → 0.2.0

data/CHANGELOG.rdoc CHANGED
@@ -1,3 +1,8 @@
+ == 0.2.0 (3rd May 2011)
+ * Add ability to describe and conform to vocabularies
+ * Rename Mida::Property to Mida::Itemprop to better reflect use
+ * Make some of the Mida::Itemprop class methods private
+
  == 0.1.3 (18th April 2011)
  * Ensure itemprops are parsed properly if containing non-microdata elements
  * Support itemprops nested within other itemprops
data/README.rdoc CHANGED
@@ -10,7 +10,7 @@ This is based on the latest Published version of the Microdata Specification
  dated {5th April 2011}[http://www.w3.org/TR/2011/WD-microdata-20110405/].
  
  == Installation
- With Ruby and Rubygems:
+ Mida keeps RubyGems[http://rubygems.org/gems/mida] up-to-date with its latest version, so installing is as easy as:
  gem install mida
  
  === Requirements:
@@ -58,6 +58,29 @@ values will be an array of either +String+ or <tt>Mida::Item</tt> instances.
  To see the +properties+ of the +Item+:
  puts doc.items.first.properties
  
+ === Working with Vocabularies
+ Mida allows you to define vocabularies, so that input data can be constrained to match
+ expected patterns. By default a generic vocabulary (<tt>Mida::Vocabulary::Generic</tt>)
+ is registered which will match against any +itemtype+ with any number of properties.
+
+ If you want to specify a vocabulary you create a class derived from <tt>Mida::VocabularyDesc</tt>
+ and use +itemtype+, +has_one+, +has_many+ and +types+ to describe the vocabulary.
+
+ As an example the following describes a subset of Google's Review vocabulary:
+ class Review < Mida::VocabularyDesc
+ itemtype %r{http://data-vocabulary.org/review}
+ has_one 'itemreviewed'
+ has_one 'rating'
+ end
+
+ To register the above Vocabulary use:
+ Mida::Vocabulary.register(Review)
+
+ Now if Mida is parsing some input and manages to match against the +Review+ +itemtype+, it
+ will only allow the specified properties and will reject any that don't have the correct number. It
+ will also set <tt>Item#vocabulary</tt> accordingly, e.g.
+ doc.items.first.vocabulary # => Review
+
  == Bugs/Feature Requests
  If you find a bug or want to make a feature request, please report it at the
  Mida project's {issues tracker}[https://github.com/LawrenceWoodman/mida/issues]
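Reviewer note on the README change above: the rule "only allow the specified properties and reject any that don't have the correct number" can be illustrated with a standalone sketch. This is not Mida's code. REVIEW_SPEC, PARSED and conform are hypothetical names, the 'tags' property is invented for the example, and the :any wildcard handled by the real Item#validate_properties is deliberately left out.

```ruby
# Hypothetical stand-in for a vocabulary's property specification.
# The shape ({num:, types:}) follows VocabularyDesc.prop_spec in this diff.
REVIEW_SPEC = {
  'itemreviewed' => { num: :one,  types: [String] },
  'rating'       => { num: :one,  types: [String] },
  'tags'         => { num: :many, types: [String] }
}.freeze

# Keep a property only if the spec names it and the number of values fits,
# then keep only the values whose class is listed in the spec.
def conform(properties, spec)
  properties.each_with_object({}) do |(name, values), kept|
    rule = spec[name]
    next unless rule                                   # unspecified property
    next if rule[:num] == :one && values.length != 1   # wrong number of values
    kept[name] = values.select { |value| rule[:types].include?(value.class) }
  end
end

# Hypothetical parsed properties, in the name => [values] form Mida uses.
PARSED = {
  'itemreviewed' => ['Romeo Pizza'],
  'rating'       => ['4.5', '5'],
  'tags'         => ['pizza', 'italian'],
  'reviewer'     => ['Ulysses Grant']
}.freeze

RESULT = conform(PARSED, REVIEW_SPEC)
```

Under these assumptions 'rating' is dropped because a has_one property arrived with two values, and 'reviewer' is dropped because the spec does not mention it.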
data/Rakefile CHANGED
@@ -6,7 +6,7 @@ spec = Gem::Specification.new do |s|
  s.name = "mida"
  s.summary = "A Microdata parser/extractor library"
  s.description = "A Microdata parser and extractor library, based on the latest published version of the Microdata Specification, dated 5th April 2011."
- s.version = "0.1.3"
+ s.version = "0.2.0"
  s.author = "Lawrence Woodman"
  s.email = "lwoodman@vlifesystems.com"
  s.homepage = %q{http://github.com/LawrenceWoodman/mida}
data/lib/mida.rb CHANGED
@@ -3,4 +3,7 @@ Dir[File.dirname(__FILE__) + '/mida/*.rb'].each { |f| require f }
  
  # Mida is a Microdata parser and extractor.
  module Mida
+
  end
+
+ require_relative 'mida/vocabulary/generic'
data/lib/mida/document.rb CHANGED
@@ -11,7 +11,7 @@ module Mida
  
  # Create a new Microdata object
  #
- # [target] The string containing the html that you want to parse
+ # [target] The string containing the html that you want to parse.
  # [page_url] The url of target used for form absolute urls. This must
  # include the filename, e.g. index.html.
  def initialize(target, page_url=nil)
@@ -23,26 +23,31 @@ module Mida
  # Returns an array of matching Mida::Item objects
  #
  # [vocabulary] A regexp to match the item types against
+ # or a Class derived from Mida::VocabularyDesc
+ # to match against
  def search(vocabulary, items=@items)
  found_items = []
+ regexp_passed = vocabulary.kind_of?(Regexp)
+ regexp = if regexp_passed then vocabulary else vocabulary.itemtype end
+
  items.each do |item|
  # Allows matching against empty string, otherwise couldn't match
  # as item.type can be nil
- if (item.type.nil? && "" =~ vocabulary) || (item.type =~ vocabulary)
+ if (item.type.nil? && "" =~ regexp) || (item.type =~ regexp)
  found_items << item
  end
- found_items += search_values(item.properties.values, vocabulary)
+ found_items += search_values(item.properties.values, regexp)
  end
  found_items
  end
  
  private
  def extract_items
- items_doc = @doc.search('//*[@itemscope and not(@itemprop)]')
- return nil unless items_doc
+ itemscopes = @doc.search('//*[@itemscope and not(@itemprop)]')
+ return nil unless itemscopes
  
- items_doc.collect do |item_doc|
- Item.new(item_doc, @page_url)
+ itemscopes.collect do |itemscope|
+ Item.new(itemscope, @page_url)
  end
  end
  
data/lib/mida/item.rb CHANGED
@@ -4,6 +4,9 @@ module Mida
  
  # Class that holds each item/itemscope
  class Item
+ # The vocabulary used to interpret this item
+ attr_reader :vocabulary
+
  # The Type of the item
  attr_reader :type
  
@@ -17,20 +20,27 @@ module Mida
  
  # Create a new Item object
  #
- # [itemscope] The itemscope that you want to parse
- # [page_url] The url of target used for form absolute urls
+ # [itemscope] The itemscope that you want to parse.
+ # [page_url] The url of target used for form absolute url.
  def initialize(itemscope, page_url=nil)
  @itemscope, @page_url = itemscope, page_url
  @type, @id = extract_attribute('itemtype'), extract_attribute('itemid')
+ @vocabulary = Mida::Vocabulary.find(@type)
  @properties = {}
  add_itemref_properties
- traverse_elements(extract_elements(itemscope))
+ parse_elements(extract_elements(@itemscope))
+ validate_properties
  end
  
  # Return a Hash representation
- # of the form {type: 'The item type', properties: {'a name' => 'avalue' }}
+ # of the form:
+ # { vocabulary: 'http://example.com/vocab/review',
+ # type: 'The item type',
+ # id: 'urn:isbn:1-934356-08-5',
+ # properties: {'a name' => 'avalue' }
+ # }
  def to_h
- {type: @type, id: @id, properties: properties_to_h(@properties)}
+ {vocabulary: @vocabulary, type: @type, id: @id, properties: properties_to_h(@properties)}
  end
  
  def to_s
@@ -38,11 +48,55 @@ module Mida
  end
  
  def ==(other)
- @type == other.type and @id == other.id and @properties == other.properties
+ @vocabulary == other.vocabulary && @type == other.type &&
+ @id == other.id && @properties == other.properties
  end
  
  private
  
+ # Validate the properties so that they are in their proper form
+ def validate_properties
+ @properties =
+ @properties.each_with_object({}) do |(property, values), hash|
+ if valid_property?(property, values)
+ hash[property] = valid_values(property, values)
+ end
+ end
+ end
+
+ # Return whether the number of values conforms to the spec
+ def valid_num_values?(property, values)
+ return false unless @vocabulary.prop_spec.has_key?(property)
+ property_spec = @vocabulary.prop_spec[property]
+ (property_spec[:num] == :many ||
+ (property_spec[:num] == :one && values.length == 1))
+ end
+
+ def valid_property?(property, values)
+ [property, :any].any? {|prop| valid_num_values?(prop, values)}
+ end
+
+ def valid_values(property, values)
+ prop_types = if @vocabulary.prop_spec.has_key?(property)
+ @vocabulary.prop_spec[property][:types]
+ else
+ @vocabulary.prop_spec[:any][:types]
+ end
+
+ values.select {|value| valid_type(prop_types, value) }
+ end
+
+ def valid_type(prop_types, value)
+ if value.respond_to?(:vocabulary)
+ if prop_types.include?(value.vocabulary) || prop_types.include?(:any)
+ return true
+ end
+ elsif prop_types.include?(value.class) || prop_types.include?(:any)
+ return true
+ end
+ false
+ end
+
  def extract_attribute(attribute)
  (value = @itemscope.attribute(attribute)) ? value.value : nil
  end
@@ -66,31 +120,33 @@ module Mida
  end
  
  def properties_to_h(properties)
- hash = {}
- properties.each { |name, value| hash[name] = value_to_h(value) }
- hash
+ properties.each_with_object({}) do |(name, value), hash|
+ hash[name] = value_to_h(value)
+ end
  end
  
  # Add any properties referred to by 'itemref'
  def add_itemref_properties
  itemref = extract_attribute('itemref')
  if itemref
- itemref.split.each {|id| traverse_elements(find_with_id(id))}
+ itemref.split.each {|id| parse_elements(find_with_id(id))}
  end
  end
  
- def traverse_elements(elements)
- elements.each do |element|
- itemscope = element.attribute('itemscope')
- itemprop = element.attribute('itemprop')
- internal_elements = extract_elements(element)
- add_itemprop(element) if itemscope || itemprop
- traverse_elements(internal_elements) if internal_elements && !itemscope
- end
+ def parse_elements(elements)
+ elements.each {|element| parse_element(element)}
+ end
+
+ def parse_element(element)
+ itemscope = element.attribute('itemscope')
+ itemprop = element.attribute('itemprop')
+ internal_elements = extract_elements(element)
+ add_itemprop(element) if itemscope || itemprop
+ parse_elements(internal_elements) if internal_elements && !itemscope
  end
  
  def add_itemprop(itemprop)
- properties = Property.parse(itemprop, @page_url)
+ properties = Itemprop.parse(itemprop, @page_url)
  properties.each { |name, value| (@properties[name] ||= []) << value }
  end
  
@@ -4,7 +4,7 @@ require 'uri'
  module Mida
  
  # Module that parses itemprop elements
- module Property
+ module Itemprop
  
  # Returns a Hash representing the property.
  # Hash is of the form {'property name' => 'value'}
@@ -63,6 +63,9 @@ module Mida
  end
  end
  
+ private_class_method :make_absolute_url, :extract_property_names
+ private_class_method :extract_property_value, :extract_property
+
  end
  
  end
@@ -0,0 +1,26 @@
+ module Mida
+
+ # Module to register the Vocabularies with
+ module Vocabulary
+
+ # Register a vocabulary that can be used when parsing,
+ # later vocabularies are given precedence over earlier ones
+ def self.register(vocabulary)
+ (@vocabularies ||= []) << vocabulary
+ end
+
+ # Find the last vocabulary registered that matches the itemtype
+ def self.find(itemtype)
+ @vocabularies.reverse_each do |vocabulary|
+ if ((itemtype || "") =~ vocabulary.itemtype) then return vocabulary end
+ end
+ nil
+ end
+
+ # Return the registered vocabularies
+ def self.vocabularies
+ @vocabularies
+ end
+
+ end
+ end
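Reviewer note on the registry above: the lookup order ("later vocabularies are given precedence over earlier ones") can be sketched in isolation. This is a standalone approximation, not Mida's code. VocabRegistry, GENERIC and REVIEW are hypothetical names, and plain hashes stand in for vocabulary classes.

```ruby
# Hypothetical registry mirroring the register/find pair in the diff.
module VocabRegistry
  @vocabularies = []

  # Append, so that later registrations are found first by find.
  def self.register(vocabulary)
    @vocabularies << vocabulary
  end

  # Walk the list backwards and return the first vocabulary whose
  # itemtype regexp matches; a nil itemtype is treated as "".
  def self.find(itemtype)
    @vocabularies.reverse_each do |vocabulary|
      return vocabulary if (itemtype || '') =~ vocabulary[:itemtype]
    end
    nil
  end
end

# Hashes stand in for the vocabulary classes (an assumption of this sketch).
GENERIC = { name: 'Generic', itemtype: %r{} }.freeze
REVIEW  = { name: 'Review', itemtype: %r{http://data-vocabulary.org/Review} }.freeze

VocabRegistry.register(GENERIC)
VocabRegistry.register(REVIEW)
```

Because the regexp is unanchored, an itemtype such as 'http://data-vocabulary.org/Review-aggregate' also resolves to REVIEW in this sketch. Anchoring the pattern with \A and \z would be one way to avoid that if exact matching is wanted.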
@@ -0,0 +1,15 @@
+ module Mida
+ module Vocabulary
+
+ # A Generic vocabulary that will match against anything
+ class Generic < Mida::VocabularyDesc
+ itemtype %r{}
+ has_many :any do
+ types :any
+ end
+ end
+
+ register(Generic)
+ end
+
+ end
@@ -0,0 +1,57 @@
+ module Mida
+
+ # Class used to describe a vocabulary
+ #
+ # To specify a vocabulary use the following methods:
+ # +itemtype+, +has_one+, +has_many+, +types+
+ class VocabularyDesc
+
+ # Sets the regular expression to match against the +itemtype+
+ # or returns the current regular expression
+ def self.itemtype(regexp_arg=nil)
+ if regexp_arg
+ @itemtype = regexp_arg
+ else
+ @itemtype
+ end
+ end
+
+ # Getter to read the created propeties specification
+ def self.prop_spec
+ @prop_spec || {}
+ end
+
+ # The types a property can take. E.g. +String+, or another +Vocabulary+
+ # If you want to say any type, then use +:any+ as the class
+ # This should be used within a +has_one+ or +has_many+ block
+ def self.types(*type_classes)
+ {types: type_classes}
+ end
+
+ # Defines the properties as only containing one value
+ # If want to say any property name, then use +:any+ as a name
+ def self.has_one(*property_names, &block)
+ has(:one, *property_names, &block)
+ end
+
+ # Defines the properties as containing many values
+ # If want to say any property name, then use +:any+ as a name
+ def self.has_many(*property_names, &block)
+ has(:many, *property_names, &block)
+ end
+
+ def self.has(num, *property_names, &block)
+ @prop_spec ||= {}
+ property_names.each_with_object(@prop_spec) do |name, prop_spec|
+ prop_spec[name] = if block_given?
+ {num: num}.merge(yield)
+ else
+ {num: num, types: [String]}
+ end
+ end
+ end
+
+ private_class_method :has
+
+ end
+ end
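Reviewer note on VocabularyDesc above: the DSL relies on class-level instance variables, so each subclass accumulates its own property specification. A reduced sketch of that pattern follows. The Mini* class names are hypothetical, and block.call is used here in place of the yield in the diff.

```ruby
# Hypothetical miniature of the class-level DSL in the diff.
class MiniVocabularyDesc
  # Each class reads its own @prop_spec; an unconfigured class yields {}.
  def self.prop_spec
    @prop_spec || {}
  end

  # Meant to be called inside a has_one/has_many block.
  def self.types(*type_classes)
    { types: type_classes }
  end

  def self.has_one(*names, &block)
    has(:one, *names, &block)
  end

  def self.has_many(*names, &block)
    has(:many, *names, &block)
  end

  # Record one entry per property name; String is the default type.
  def self.has(num, *names, &block)
    @prop_spec ||= {}
    names.each do |name|
      @prop_spec[name] =
        block ? { num: num }.merge(block.call) : { num: num, types: [String] }
    end
  end
  private_class_method :has
end

class MiniColour < MiniVocabularyDesc
  has_one 'red', 'green', 'blue'
end

class MiniPerson < MiniVocabularyDesc
  has_one 'name'
  has_many 'favourite-colours' do
    types MiniColour
  end
end
```

Since @prop_spec lives on the class object that received the call, MiniColour and MiniPerson do not see each other's entries, and the base class stays empty.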
@@ -94,8 +94,10 @@ describe Mida::Document, 'when run with a document containing textContent and no
  
  it 'should return all the properties and types with the correct values' do
  expected_results = [
- { type: nil, id: nil, properties: {'link_field' => ['']} },
- { type: nil,
+ { vocabulary: Mida::Vocabulary::Generic,
+ type: nil, id: nil, properties: {'link_field' => ['']} },
+ { vocabulary: Mida::Vocabulary::Generic,
+ type: nil,
  id: nil,
  properties: {
  'span_field' => ['Some span content'],
@@ -130,11 +132,13 @@ describe Mida::Document, 'when run with a document containing textContent and no
  
  it 'should return all the properties and types with the correct values' do
  expected_results = [
- { type: nil, id: nil, properties: {
+ { vocabulary: Mida::Vocabulary::Generic,
+ type: nil, id: nil, properties: {
  'link_field' => ['http://example.com/start/stylesheet.css']
  }
  },
- { type: nil,
+ { vocabulary: Mida::Vocabulary::Generic,
+ type: nil,
  id: nil,
  properties: {
  'span_field' => ['Some span content'],
@@ -192,6 +196,7 @@ describe Mida::Document, 'when run against a full html document containing one i
  
  it 'should return all the properties and types with the correct values' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
@@ -238,11 +243,13 @@ describe Mida::Document, 'when run against a full html document containing one i
  
  it 'should return all the properties and types with the correct values' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
  'itemreviewed' => ['Romeo Pizza'],
  'address' => [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil, id: nil, properties: {
  'firstline' => ['237 Italian Way'],
  'country' => ['United Kingdom']
@@ -287,15 +294,18 @@ describe Mida::Document, 'when run against a full html document containing one i
  
  it 'should return all the properties and types with the correct values' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
  'itemreviewed' => ['Romeo Pizza'],
  'address' => [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
  'firstline' => [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
@@ -351,6 +361,7 @@ describe Mida::Document, 'when run against a full html document containing one i
  
  it 'should return all the properties and types with the correct values' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'http://data-vocabulary.org/Review',
  id: nil,
  properties: {
@@ -412,6 +423,7 @@ describe Mida::Document, 'when run against a full html document containing two n
  
  it 'should return all the properties and types with the correct values for 1st itemscope' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'http://data-vocabulary.org/Review',
  id: nil,
  properties: {
@@ -424,6 +436,7 @@ describe Mida::Document, 'when run against a full html document containing two n
  
  it 'should return all the properties from the text for 2nd itemscope' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'http://data-vocabulary.org/Organization',
  id: nil,
  properties: {
@@ -474,12 +487,14 @@ describe Mida::Document, 'when run against a full html document containing one
  context "when looking at the outer vocabulary" do
  it 'should return all the properties from the text with the correct values' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'http://data-vocabulary.org/Product',
  id: nil,
  properties: {
  'name' => ['DC07'],
  'brand' => ['Dyson'],
  'review' => [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'http://data-vocabulary.org/Review-aggregate',
  id: nil,
  properties: {
@@ -580,11 +595,13 @@ describe Mida::Document, 'when run against a document using itemrefs' do
  
  it 'should return all the properties from the text with the correct values' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
  'name' => ['Amanda'],
  'band' => [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
@@ -634,13 +651,15 @@ describe Mida::Document, 'when run against a document using multiple itemprops w
  
  it 'should return all the properties from the text with the correct values' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'icecreams',
  id: nil,
  properties: {
  'flavour' => [
  'Lemon sorbet',
  'Apricot sorbet',
- { type: 'icecream-type',
+ { vocabulary: Mida::Vocabulary::Generic,
+ type: 'icecream-type',
  id: nil,
  properties: {
  'fruit' => ['Strawberry'],
@@ -671,6 +690,7 @@ describe Mida::Document, 'when run against a document using an itemprop with mul
  
  it 'should return all the properties from the text with the correct values' do
  expected_results = [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
@@ -682,3 +702,63 @@ describe Mida::Document, 'when run against a document using an itemprop with mul
  test_parsing(@md, %r{}, expected_results)
  end
  end
+
+ describe Mida::Document, 'when run against a full html document containing an itemtype that matches a registered vocabulary' do
+
+ before do
+ html = '
+ <html><body>
+ There is some text here
+ <div>
+ and also some here
+ <div itemscope itemtype="http://data-vocabulary.org/Review">
+ <span itemprop="itemreviewed">Romeo Pizza</span>
+ Reviewed by <span itemprop="reviewer">Ulysses Grant</span> on
+ <time itemprop="dtreviewed" datetime="2009-01-06">Jan 6</time>.
+ <span itemprop="summary">Delicious, tasty pizza in Eastlake!</span>
+ <span itemprop="description">This is a very nice pizza place.</span>
+ Rating: <span itemprop="rating">4.5</span>
+ </div>
+ </div>
+ </body></html>
+ '
+
+ class Review < Mida::VocabularyDesc
+ itemtype %r{http://data-vocabulary.org/Review}
+ has_one 'itemreviewed', 'reviewer', 'dtreviewed', 'summary'
+ has_one 'rating', 'description'
+ end
+ Mida::Vocabulary.register(Review)
+
+ @md = Mida::Document.new(html)
+
+ end
+
+ it_should_behave_like 'one root itemscope'
+
+ it '#search should match against Review' do
+ @md.search(Review).size.should == 1
+ end
+
+ it 'should specify the correct type' do
+ @md.search(Review).first.type.should == 'http://data-vocabulary.org/Review'
+ end
+
+ it 'should return all the properties and types with the correct values' do
+ expected_results = [{
+ vocabulary: Review,
+ type: 'http://data-vocabulary.org/Review',
+ id: nil,
+ properties: {
+ 'itemreviewed' => ['Romeo Pizza'],
+ 'reviewer' => ['Ulysses Grant'],
+ 'dtreviewed' => ['2009-01-06'],
+ 'summary' => ['Delicious, tasty pizza in Eastlake!'],
+ 'description' => ['This is a very nice pizza place.'],
+ 'rating' => ['4.5']
+ }
+ }]
+ test_parsing(@md, Review, expected_results)
+ end
+
+ end
data/spec/item_spec.rb CHANGED
@@ -17,6 +17,10 @@ describe Mida::Item, 'when initialized with an itemscope containing just itempro
  @item = Mida::Item.new(itemscope_el)
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#type should return the correct type' do
  @item.type.should == nil
  end
@@ -34,7 +38,7 @@ describe Mida::Item, 'when initialized with an itemscope containing just itempro
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
- type: nil, id: nil, properties: {
+ vocabulary: Mida::Vocabulary::Generic, type: nil, id: nil, properties: {
  'first_name' => ['Lorry'],
  'last_name' => ['Woodman']
  }
@@ -49,6 +53,10 @@ describe Mida::Item, 'when initialized with an itemscope containing just itempro
  @item = Mida::Item.new(itemscope_el)
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#type should return the correct type' do
  @item.type.should == 'person'
  end
@@ -66,6 +74,7 @@ describe Mida::Item, 'when initialized with an itemscope containing just itempro
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'person',
  id: nil,
  properties: {
@@ -112,6 +121,10 @@ describe Mida::Item, 'when initialized with an itemscope containing itemprops su
  @item.type.should == 'person'
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#id should return the correct id' do
  @item.id.should == nil
  end
@@ -125,6 +138,7 @@ describe Mida::Item, 'when initialized with an itemscope containing itemprops su
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'person',
  id: nil,
  properties: {
@@ -145,6 +159,10 @@ describe Mida::Item, "when initialized with an itemscope containing itemprops
  @item = Mida::Item.new(itemscope)
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#type should return the correct type' do
  @item.type.should == nil
  end
@@ -161,6 +179,7 @@ describe Mida::Item, "when initialized with an itemscope containing itemprops
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
@@ -184,6 +203,10 @@ describe Mida::Item, "when initialized with an itemscope containing an itemprop
  @item = Mida::Item.new(itemscope)
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#type should return the correct type' do
  @item.type.should == nil
  end
@@ -201,6 +224,7 @@ describe Mida::Item, "when initialized with an itemscope containing an itemprop
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
@@ -239,6 +263,10 @@ describe Mida::Item, 'when initialized with an itemscope containing itemprops wi
  @item = Mida::Item.new(icecreams)
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#type should return the correct type' do
  @item.type.should == 'icecreams'
  end
@@ -259,13 +287,15 @@ describe Mida::Item, 'when initialized with an itemscope containing itemprops wi
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'icecreams',
  id: nil,
  properties: {
  'flavour' => [
  'Lemon Sorbet',
  'Apricot Sorbet',
- { type: 'icecream-type',
+ { vocabulary: Mida::Vocabulary::Generic,
+ type: 'icecream-type',
  id: nil,
  properties: {
  'fruit' => ['Strawberry'],
@@ -306,6 +336,10 @@ describe Mida::Item, 'when initialized with an itemscope containing itemrefs' do
  @item = Mida::Item.new(age_div)
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#type should return the correct type' do
  @item.type.should == nil
  end
@@ -324,12 +358,14 @@ describe Mida::Item, 'when initialized with an itemscope containing itemrefs' do
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
  'age' => ['30'],
  'name' => ['Amanda'],
  'band' => [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: nil,
  id: nil,
  properties: {
@@ -358,6 +394,10 @@ describe Mida::Item, 'when initialized with an itemscope containing an itemid' d
  @item = Mida::Item.new(book)
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#type should return the correct type' do
  @item.type.should == 'book'
  end
@@ -375,6 +415,7 @@ describe Mida::Item, 'when initialized with an itemscope containing an itemid' d
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'book',
  id: 'urn:isbn:978-1-849510-50-9',
  properties: {
@@ -425,6 +466,10 @@ describe Mida::Item, 'when initialized with an itemscope containing itemscopes a
  before do
  end
  
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Mida::Vocabulary::Generic
+ end
+
  it '#type should return the correct type' do
  @item.type.should == 'review'
  end
@@ -443,18 +488,21 @@ describe Mida::Item, 'when initialized with an itemscope containing itemscopes a
  
  it '#to_h should return the correct type and properties' do
  @item.to_h.should == {
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'review',
  id: nil,
  properties: {
  'item_name' => ['Acme Anvil'],
  'rating' => ['5'],
  'reviewer' => [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'person',
  id: nil,
  properties: {
  'first_name' => ['Lorry'],
  'last_name' => ['Woodman'],
  'represents' => [{
+ vocabulary: Mida::Vocabulary::Generic,
  type: 'organization',
  id: nil,
  properties: {
@@ -467,3 +515,70 @@ describe Mida::Item, 'when initialized with an itemscope containing itemscopes a
  }
  end
  end
+
+ describe Mida::Item, 'when initialized with an itemscope that matches a non-generic registered vocabulary' do
+ before do
+
+ class Colour < Mida::VocabularyDesc
+ itemtype %r{http://example.com/vocab/colour}
+ has_one 'red', 'green', 'blue'
+ end
+ Mida::Vocabulary.register(Colour)
+
+ class Person < Mida::VocabularyDesc
+ itemtype %r{http://example.com/vocab/person}
+ has_one 'name'
+ has_one 'url'
+ has_many 'limbs'
+ has_many 'favourite-colours' do
+ types Colour
+ end
+ end
+ Mida::Vocabulary.register(Person)
+
+ red = mock_element('span', {'itemprop' => 'red'}, '0xFF')
+ green = mock_element('span', {'itemprop' => 'green'}, '0x00')
+ blue = mock_element('span', {'itemprop' => 'blue'}, '0xFF')
+ purple = mock_element('div', {'itemscope' => true,
+ 'itemtype' => 'http://example.com/vocab/colour',
+ 'itemprop' => 'favourite-colours'},
+ nil, [red, green, blue])
+ orange = mock_element('span', {'itemprop' => 'favourite-colours'}, 'Orange')
+
+ name1 = mock_element('span', {'itemprop' => 'name'}, 'Lawrence Woodman')
+ name2 = mock_element('span', {'itemprop' => 'name'}, 'Lorry Woodman')
+ url = mock_element('a', {'itemprop' => 'url', 'href' => 'http://example.com/myhomepage'})
+ arm = mock_element('span', {'itemprop' => 'limbs'}, 'Arm')
+ leg = mock_element('span', {'itemprop' => 'limbs'}, 'Leg')
+ robert_wilson = mock_element('span', {'itemprop' => 'favourite-author'}, 'Robert Wilson')
+ itemscope_el = mock_element('div', {'itemscope' => true,
+ 'itemtype' =>'http://example.com/vocab/person'
+ }, nil, [name1, name2, url, arm, leg, purple, orange, robert_wilson])
+ @item = Mida::Item.new(itemscope_el, "http://example.com")
+ end
+
+ it '#vocabulary should return the correct vocabulary' do
+ @item.vocabulary.should == Person
+ end
+
+ it 'should reject properties that have multiple values if has_one specified' do
+ @item.properties.should_not have_key('name')
+ end
+
+ it 'should accept properties that have a single value if has_one specified' do
+ @item.properties['url'].should == ['http://example.com/myhomepage']
+ end
+
+ it 'should accept properties that have a many values if has_many specified' do
+ @item.properties['limbs'].should == ['Arm', 'Leg']
+ end
+
+ it 'should register properties using the specified types' do
+ @item.properties['favourite-colours'].size.should == 1
+ @item.properties['favourite-colours'].first.vocabulary.should == Colour
+ end
+
+ it 'should reject properties that are not specified' do
+ @item.properties.should_not have_key('favourite-author')
+ end
+ end
@@ -2,27 +2,27 @@ require_relative 'spec_helper'
  require_relative '../lib/mida'
  
  
- describe Mida::Property, 'when parsing an element without an itemprop attribute' do
+ describe Mida::Itemprop, 'when parsing an element without an itemprop attribute' do
    before do
      @element = mock_element('span')
    end
  
    it '#parse should return an empty Hash' do
-     Mida::Property.parse(@element).should == {}
+     Mida::Itemprop.parse(@element).should == {}
    end
  end
  
- describe Mida::Property, 'when parsing an element with one itemprop name' do
+ describe Mida::Itemprop, 'when parsing an element with one itemprop name' do
    before do
      @element = mock_element('span', {'itemprop' => 'reviewer'}, 'Lorry Woodman')
    end
  
    it '#parse should return a Hash with the correct name/value pair' do
-     Mida::Property.parse(@element).should == {'reviewer' => 'Lorry Woodman'}
+     Mida::Itemprop.parse(@element).should == {'reviewer' => 'Lorry Woodman'}
    end
  end
  
- describe Mida::Property, "when parsing an element whose inner text contains\
+ describe Mida::Itemprop, "when parsing an element whose inner text contains\
  non microdata elements" do
    before do
      html = '<span itemprop="reviewer">Lorry <em>Woodman</em></span>'
@@ -31,11 +31,11 @@ describe Mida::Property, "when parsing an element whose inner text contains\
    end
  
    it '#parse should return a Hash with the correct name/value pair' do
-     Mida::Property.parse(@itemprop).should == {'reviewer' => 'Lorry Woodman'}
+     Mida::Itemprop.parse(@itemprop).should == {'reviewer' => 'Lorry Woodman'}
    end
  end
  
- describe Mida::Property, 'when parsing an itemscope element that has a relative url' do
+ describe Mida::Itemprop, 'when parsing an itemscope element that has a relative url' do
    before do
  
      # The first_name element
@@ -54,7 +54,7 @@ describe Mida::Property, 'when parsing an itemscope element that has a relative
    end
  
    it '#parse should return a Hash with the correct name/value pair' do
-     property = Mida::Property.parse(@itemscope_el, "http://example.com")
+     property = Mida::Itemprop.parse(@itemscope_el, "http://example.com")
      property.size.should == 1
      reviewer = property['reviewer']
      reviewer.type.should == 'person'
@@ -66,13 +66,13 @@ describe Mida::Property, 'when parsing an itemscope element that has a relative
    end
  end
  
- describe Mida::Property, 'when parsing an element with multiple itemprop names' do
+ describe Mida::Itemprop, 'when parsing an element with multiple itemprop names' do
    before do
      @element = mock_element('span', {'itemprop' => 'reviewer friend person'}, 'the property text')
    end
  
    it '#parse should return a Hash with the name/value pairs' do
-     Mida::Property.parse(@element).should == {
+     Mida::Itemprop.parse(@element).should == {
        'reviewer' => 'the property text',
        'friend' => 'the property text',
        'person' => 'the property text'
@@ -80,7 +80,7 @@ describe Mida::Property, 'when parsing an element with multiple itemprop names'
    end
  end
  
- describe Mida::Property, 'when parsing an element with non text content url values' do
+ describe Mida::Itemprop, 'when parsing an element with non text content url values' do
    before :all do
      URL_ELEMENTS = {
        'a' => 'href', 'area' => 'href',
@@ -98,7 +98,7 @@ describe Mida::Property, 'when parsing an element with non text content url valu
        url = 'register/index.html'
        URL_ELEMENTS.each do |tag, attr|
          element = mock_element(tag, {'itemprop' => 'url', attr => url})
-         Mida::Property.parse(element).should == {'url' => ''}
+         Mida::Itemprop.parse(element).should == {'url' => ''}
        end
      end
  
@@ -112,7 +112,7 @@ describe Mida::Property, 'when parsing an element with non text content url valu
        urls.each do |url|
          URL_ELEMENTS.each do |tag, attr|
            element = mock_element(tag, {'itemprop' => 'url', attr => url})
-           Mida::Property.parse(element).should == {'url' => url}
+           Mida::Itemprop.parse(element).should == {'url' => url}
          end
        end
      end
@@ -127,7 +127,7 @@ describe Mida::Property, 'when parsing an element with non text content url valu
        url = 'register/index.html'
        URL_ELEMENTS.each do |tag, attr|
          element = mock_element(tag, {'itemprop' => 'url', attr => url})
-         Mida::Property.parse(element, @page_url).should ==
+         Mida::Itemprop.parse(element, @page_url).should ==
            {'url' => 'http://example.com/test/register/index.html'}
        end
      end
@@ -142,7 +142,7 @@ describe Mida::Property, 'when parsing an element with non text content url valu
        urls.each do |url|
          URL_ELEMENTS.each do |tag, attr|
            element = mock_element(tag, {'itemprop' => 'url', attr => url})
-           Mida::Property.parse(element, @page_url).should == {'url' => url}
+           Mida::Itemprop.parse(element, @page_url).should == {'url' => url}
          end
        end
      end
@@ -150,16 +150,16 @@ describe Mida::Property, 'when parsing an element with non text content url valu
    end
  end
  
- describe Mida::Property, 'when parsing an element with non text content non url values' do
+ describe Mida::Itemprop, 'when parsing an element with non text content non url values' do
    it 'should get values from a meta content attribute' do
      element = mock_element('meta', {'itemprop' => 'reviewer',
                                      'content' => 'Lorry Woodman'})
-     Mida::Property.parse(element).should == {'reviewer' => 'Lorry Woodman'}
+     Mida::Itemprop.parse(element).should == {'reviewer' => 'Lorry Woodman'}
    end
  
    it 'should get time from a time datetime attribute' do
      element = mock_element('time', {'itemprop' => 'dtreviewed',
                                      'datetime' => '2011-04-04'})
-     Mida::Property.parse(element).should == {'dtreviewed' => '2011-04-04'}
+     Mida::Itemprop.parse(element).should == {'dtreviewed' => '2011-04-04'}
    end
  end
@@ -0,0 +1,106 @@
+ require_relative 'spec_helper'
+ require_relative '../lib/mida'
+
+ describe Mida::VocabularyDesc, 'when subclassed and given has statements with no blocks' do
+   before do
+     class Organization < Mida::VocabularyDesc
+       itemtype %r{http://example\.com.*?organization$}i
+       has_one 'name'
+       has_many 'tel', 'url'
+     end
+   end
+
+   it '#itemtype should return the correct regexp' do
+     Organization.itemtype.should == %r{http://example\.com.*?organization$}i
+   end
+
+   it 'should specify name to appear once' do
+     Organization.prop_spec['name'][:num].should == :one
+   end
+
+   it 'should specify tel and url to appear many times' do
+     Organization.prop_spec['tel'][:num].should == :many
+     Organization.prop_spec['url'][:num].should == :many
+   end
+ end
+
+ describe Mida::VocabularyDesc, 'when subclassed and given has statements with blocks' do
+   before do
+     class Rating < Mida::VocabularyDesc
+       itemtype %r{http://example\.com.*?rating$}i
+       has_one 'best', 'value'
+     end
+
+     class Comment < Mida::VocabularyDesc
+       itemtype %r{http://example\.com.*?comment$}i
+       has_one 'commentor', 'comment'
+     end
+
+     class Review < Mida::VocabularyDesc
+       itemtype %r{http://example\.com.*?review$}i
+       has_one 'itemreviewed'
+       has_one 'rating' do
+         types Rating, String
+       end
+       has_many 'comments' do
+         types Comment
+       end
+     end
+   end
+
+   it '#itemtype should return the correct regexp' do
+     Review.itemtype.should == %r{http://example\.com.*?review$}i
+   end
+
+   it 'should specify itemreviewed to appear once' do
+     Review.prop_spec['itemreviewed'][:num].should == :one
+   end
+
+   it 'should specify that itemreviewed only have the type String' do
+     Review.prop_spec['itemreviewed'][:types].should == [String]
+   end
+
+   it 'should specify rating to appear once' do
+     Review.prop_spec['rating'][:num].should == :one
+   end
+
+   it 'should specify rating to only have the types: Rating, String' do
+     Review.prop_spec['rating'][:types].should == [Rating, String]
+   end
+
+   it 'should specify comments to appear many times' do
+     Review.prop_spec['comments'][:num].should == :many
+   end
+
+   it 'should specify that comments only have the type Comment' do
+     Review.prop_spec['comments'][:types].should == [Comment]
+   end
+ end
+
+ describe Mida::VocabularyDesc, 'when subclassed and used with :any for properties and types' do
+   before do
+     class Person < Mida::VocabularyDesc
+       itemtype %r{}
+       has_one 'name'
+       has_many :any do
+         types :any
+       end
+     end
+   end
+
+   it '#itemtype should return the correct regexp' do
+     Person.itemtype.should == %r{}
+   end
+
+   it 'should specify that name only appears once' do
+     Person.prop_spec['name'][:num].should == :one
+   end
+
+   it 'should specify that any other property can appear many times' do
+     Person.prop_spec[:any][:num].should == :many
+   end
+
+   it 'should specify that any other property can have any type' do
+     Person.prop_spec[:any][:types].should == [:any]
+   end
+ end
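The specs above pin down the observable behaviour of the vocabulary DSL: +itemtype+ stores and returns a regexp, +has_one+ and +has_many+ record <tt>:one</tt> or <tt>:many</tt> under <tt>:num</tt>, and +types+ inside a block sets <tt>:types</tt>, which otherwise defaults to <tt>[String]</tt>. The following is only a minimal sketch of one way such a DSL could be built to satisfy those expectations. The class names (+SketchVocabularyDesc+, +SketchRating+, +SketchReview+) and the private +has+ helper are hypothetical and are not taken from lib/mida/vocabularydesc.rb, whose actual implementation may well differ.

```ruby
# Hypothetical stand-in for illustration only; this is NOT Mida::VocabularyDesc.
class SketchVocabularyDesc
  class << self
    # With an argument, store the itemtype regexp; always return the stored one.
    def itemtype(regexp = nil)
      @itemtype = regexp if regexp
      @itemtype
    end

    # Per-class map of property name => { num: :one or :many, types: [...] }.
    def prop_spec
      @prop_spec ||= {}
    end

    def has_one(*names, &block)
      has(:one, names, &block)
    end

    def has_many(*names, &block)
      has(:many, names, &block)
    end

    # Called inside a has_one/has_many block to list the allowed types.
    def types(*type_list)
      @current_types = type_list
    end

    private

    # Assumed helper: default to [String], let the block override via `types`.
    def has(num, names, &block)
      @current_types = [String]
      instance_eval(&block) if block
      names.each { |name| prop_spec[name] = { num: num, types: @current_types } }
    end
  end
end

class SketchRating < SketchVocabularyDesc
  itemtype(%r{http://example\.com.*?rating$}i)
  has_one 'best', 'value'
end

class SketchReview < SketchVocabularyDesc
  itemtype(%r{http://example\.com.*?review$}i)
  has_one 'itemreviewed'
  has_one 'rating' do
    types SketchRating, String
  end
  has_many 'comments' do
    types :any
  end
end
```

Because the state lives in class-level instance variables, each subclass in this sketch keeps its own +prop_spec+, which mirrors how the specs query <tt>Review.prop_spec</tt> and <tt>Person.prop_spec</tt> independently.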
metadata CHANGED
@@ -2,7 +2,7 @@
  name: mida
  version: !ruby/object:Gem::Version
    prerelease:
-   version: 0.1.3
+   version: 0.2.0
  platform: ruby
  authors:
  - Lawrence Woodman
@@ -10,7 +10,7 @@ autorequire:
  bindir: bin
  cert_chain: []
  
- date: 2011-04-18 00:00:00 Z
+ date: 2011-05-03 00:00:00 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: nokogiri
@@ -46,13 +46,17 @@ extra_rdoc_files:
  - CHANGELOG.rdoc
  files:
  - lib/mida.rb
- - lib/mida/property.rb
  - lib/mida/item.rb
  - lib/mida/document.rb
- - spec/property_spec.rb
+ - lib/mida/vocabulary/generic.rb
+ - lib/mida/vocabularydesc.rb
+ - lib/mida/itemprop.rb
+ - lib/mida/vocabulary.rb
+ - spec/itemprop_spec.rb
  - spec/document_spec.rb
  - spec/item_spec.rb
  - spec/spec_helper.rb
+ - spec/vocabularydesc_spec.rb
  - TODO.rdoc
  - CHANGELOG.rdoc
  - README.rdoc