mofo 0.1

data/CHANGELOG ADDED
@@ -0,0 +1,2 @@
1
+ = 0.1
2
+ - First release.
data/LICENSE ADDED
@@ -0,0 +1,18 @@
1
+ Copyright (c) 2006 Chris Wanstrath
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of
4
+ this software and associated documentation files (the "Software"), to deal in
5
+ the Software without restriction, including without limitation the rights to
6
+ use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
7
+ the Software, and to permit persons to whom the Software is furnished to do so,
8
+ subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in all
11
+ copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
15
+ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
16
+ COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
17
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
18
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README ADDED
@@ -0,0 +1,187 @@
1
+ mofo.
2
+ - a ruby microformat parser -
3
+ engine
4
+ dsl
5
+ helper
6
+ toy
7
+
8
+ = Get Started Immediately
9
+
10
+ $ irb -rubygems
11
+ >> require 'mofo'
12
+ => true
13
+
14
+ >> fireball = HCard.find 'http://flickr.com/people/gruber/'
15
+ => #<HCard:0x6db898 ... >
16
+
17
+ >> fireball.nickname
18
+ => "gruber"
19
+
20
+ >> fireball.url
21
+ => "http://daringfireball.net/"
22
+
23
+ >> fireball.n.family_name
24
+ => "Gruber"
25
+
26
+ >> fireball.title
27
+ => "Raconteur"
28
+
29
+ >> fireball.adr.locality
30
+ => "Philadelphia"
31
+
32
+ >> fireball.logo
33
+ => "http://static.flickr.com/9/buddyicons/44621776@N00.jpg?1117572751"
34
+
35
+ = Microwhozit?
36
+
37
+ Microformats are tiny little markup definitions built on top of, usually,
38
+ HTML or XHTML.
39
+
40
+ You have a blog. You have recent posts on your blog's index page. You have
41
+ an Atom feed. You have recent posts on your blog's Atom feed. See where I'm
42
+ going with this?
43
+
44
+ The hAtom microformat (or uformat) can be embedded in your existing HTML by
45
+ setting CSS classes with semantic meaning inside of your posts. A class to signify
46
+ a post is contained within this div, a class to signify the contents of this
47
+ h3 are the post's title, a class to signify the contents of this span is the
48
+ blog post's author, etc.
49
+
50
+ You can then use a microformat parser (like, say, mofo) to extract this information
51
+ as you would from an Atom feed. Hell, you can even convert hAtom to Atom. It's an
52
+ insta-feed! No extra code required!
53
+
54
+ You're already doing the work, you see. Microformats are everywhere. We just need
55
+ to set them free.
56
+
57
+ Check it:
58
+
59
+ <div class="post">
60
+ <h3>Megadeth Show Last Night</h3>
61
+ <span class="subtitle">Posted by Chris on June 4th</span>
62
+ <div class="content">Went to a show last night. Megadeth. It was alright.</div>
63
+ </div>
64
+
65
+ Right? Normal. Here's the same post marked up with hAtom:
66
+
67
+ <div class="post hentry">
68
+ <h3 class="entry-title">Megadeth Show Last Night</h3>
69
+ <span class="subtitle">Posted by <span class="author vcard fn">Chris</span> on
70
+ <abbr class="updated" title="2006-06-04T10:32:10Z">June 4th</abbr></span>
71
+ <div class="content entry-content">Went to a show last night. Megadeth. It was alright.</div>
72
+ </div>
73
+
74
+ All I did was add the hentry, entry-title, and entry-content classes to existing containers. Then I
75
+ went ahead and wrapped the date in an <abbr> tag giving it a title in the microformat-standard way. Finally
76
+ I put a span around Chris signifying it as the author field of the hEntry and making it a valid hCard by
77
+ including the vcard and fn classes. It's really not all that hard. Did I mess it up? Maybe, but I'm sure I got
78
+ close. And I didn't even use a reference. Practice.
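To make the idea concrete, here is a minimal sketch of how class names alone are enough to pull fields back out of the marked-up post. It is a toy under stated assumptions: text_for_class is a hypothetical helper, not part of mofo, and it uses a regular expression where mofo itself parses with Hpricot. It only has to cope with the small snippet above.

```ruby
# Toy illustration only: mofo parses with Hpricot, not regular expressions.
HENTRY_HTML = <<~HTML
  <div class="post hentry">
    <h3 class="entry-title">Megadeth Show Last Night</h3>
    <span class="subtitle">Posted by <span class="author vcard fn">Chris</span> on
    <abbr class="updated" title="2006-06-04T10:32:10Z">June 4th</abbr></span>
    <div class="content entry-content">Went to a show last night. Megadeth. It was alright.</div>
  </div>
HTML

# Hypothetical helper (not a mofo method): returns the inner text of the
# first <tag> element whose class attribute contains css_class.
def text_for_class(html, tag, css_class)
  pattern = %r{<#{tag}\b[^>]*class="[^"]*\b#{Regexp.escape(css_class)}\b[^"]*"[^>]*>(.*?)</#{tag}>}m
  match = html.match(pattern)
  match && match[1].strip
end

puts text_for_class(HENTRY_HTML, "h3", "entry-title")
puts text_for_class(HENTRY_HTML, "span", "fn")
puts text_for_class(HENTRY_HTML, "div", "entry-content")
```

A real parser has to handle nesting, attribute order, and malformed markup, which is why a proper HTML library is the better tool once you leave toy input.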
79
+
80
+ How'd we parse this, tho?
81
+
82
+ $ irb -rubygems
83
+ >> require 'mofo'
84
+ => true
85
+
86
+ >> post = HEntry.find 'http://milesofstyle.org/posts/351-megadeth-show-last-night.html'
87
+ => #<HEntry:0x6db898 ... >
88
+
89
+ >> post.entry_title
90
+ => "Megadeth Show Last Night"
91
+
92
+ >> post.properties
93
+ => ["entry_content", "updated", "author", "entry_title"]
94
+
95
+ >> post.updated
96
+ => Sun Jun 04 10:32:10 UTC 2006
97
+
98
+ >> post.updated.class
99
+ => Time
100
+
101
+ >> post.author
102
+ => #<HCard:0x6e7b98 @properties=["fn"], @fn="Chris">
103
+
104
+ >> post.author.fn
105
+ => "Chris"
106
+
107
+ >> post.entry_content
108
+ => "Went to a show last night. Megadeth. It was alright."
109
+
110
+ That's, like, stupid easy. If HEntry.find gets back more than one hEntry, you'll get an array.
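Because a plain find can hand back one object or an array, calling code may want to normalize the result. The sketch below mirrors that collapsing rule and shows one way to undo it. Both collapse and always_array are hypothetical helper names for illustration, not mofo methods, and the strings stand in for parsed microformat objects.

```ruby
# Hypothetical helper: mimics how a plain find collapses its results.
def collapse(results)
  case results.size
  when 0 then nil
  when 1 then results.first
  else results
  end
end

# Hypothetical helper: turns nil, a single object, or an array into an array,
# so callers can always iterate.
def always_array(result)
  return [] if result.nil?
  result.is_a?(Array) ? result : [result]
end

one_post  = collapse(["post A"])
two_posts = collapse(["post A", "post B"])

always_array(one_post).each  { |post| puts post }
always_array(two_posts).each { |post| puts post }
```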
111
+
112
+ = Mofo#find
113
+
114
+ Everything revolves around the #find method. Sound familiar? Yeah.
115
+
116
+ >> Microformat.find "http://valid-url.com"
117
+ >> Microformat.find "/path/to/existing/file"
118
+ >> Microformat.find :text => "microformat text"
119
+
120
+ Also, #find can be told explicitly to find all (returning an empty array on failure) or only find
121
+ the first (returning nil on failure).
122
+
123
+ >> Microformat.find :all => "/existing/file"
124
+ => [ array of microformat objects ]
125
+
126
+ >> Microformat.find :first => "/existing/file"
127
+ => microformat object
128
+
129
+ >> Microformat.find "/existing/file"
130
+ => either an array of objects or just one object
131
+
132
+ :all and :first go outside of :text.
133
+
134
+ >> Microformat.find :all => { :text => 'mfin text' }
135
+
136
+ That's it. Some microformats take specific options.
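The call shapes above can be summarized in a small sketch of the argument handling. This is a simplified stand-in written for illustration, not the library's code: resolve_find_args is a hypothetical name, and it only reports which mode and which source a given call would select, without parsing anything.

```ruby
# Hypothetical helper: works out the lookup mode (:first, :all, or :auto)
# and the source to parse from the arguments given to find.
def resolve_find_args(*args)
  target, options = args
  options ||= target.is_a?(Hash) ? target : {}

  mode = :auto
  [:first, :all].each do |key|
    next unless options[key]
    target = options[key] # :first / :all wrap the real source
    mode = key
  end

  # A top-level :text option means the options hash itself is the source.
  source = options[:text] ? options : target
  { mode: mode, source: source }
end

p resolve_find_args("/existing/file")
p resolve_find_args({ :first => "/existing/file" })
p resolve_find_args({ :all => { :text => "mfin text" } })
```

Note how :all and :first wrap the :text hash rather than sitting beside it, which is what ":all and :first go outside of :text" means in practice.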
137
+
138
+ = Microformats
139
+
140
+ Here are the currently implemented microformats, along with a site you
141
+ can use them on today. We want more, better, faster, stat.
142
+
143
+ formats:
144
+ - hCard [ flickr profiles ]
145
+ - hCalendar [ upcoming.org ]
146
+ - hReview [ cork'd reviews ]
147
+ - hEntry [ err the blog posts ]
148
+ - xoxo [ chowhound.com ]
149
+
150
+ patterns:
151
+ - rel-tag
152
+ - rel-bookmark
153
+
154
+ Here are the microformats we want soon soon soon:
155
+ - geo
156
+ - hResume
157
+
158
+ patterns:
159
+ - include-pattern
160
+
161
+ = Ruby on Rails
162
+
163
+ mofo doubles as a Rails plugin. Just drop it into vendor/plugins and you are good to go, with all the
164
+ available microformat parsers loaded into your application.
165
+
166
+ mofo classes are YAML and Marshal approved. This means you can cache them with DRb or memcached, or store
167
+ them in a session.
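As a rough illustration of what "Marshal approved" buys you, here is a round trip through Marshal using a stand-in object. CachedCard is a hypothetical class made up for this sketch, not a mofo class, and the sketch covers Marshal only. A cache such as memcached would store the dumped bytes and load them back later.

```ruby
# Hypothetical stand-in for a parsed microformat object.
class CachedCard
  attr_reader :nickname, :url

  def initialize(nickname, url)
    @nickname = nickname
    @url = url
  end
end

card = CachedCard.new("gruber", "http://daringfireball.net/")

# Serialize to a byte string (what a cache would store) and restore it.
bytes    = Marshal.dump(card)
restored = Marshal.load(bytes)

puts restored.nickname
puts restored.url
```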
168
+
169
+ = More Info
170
+
171
+ >> http://microformats.org/
172
+ => "The homepage, check"
173
+ >> http://microformats.org/wiki/
174
+ => "The wiki, check"
175
+ >> http://blog.labnotes.org/category/microformats/
176
+ => "Assaf Arkin knows his MFin' stuff"
177
+ >> http://allinthehead.com/
178
+ => "Drew McClellan, Microformat wizard"
179
+ >> http://mofo.rubyforge.org/
180
+ => "mofo HQ"
181
+
182
+ = Other Parsers
183
+
184
+ >> Scrapi
185
+ => http://rubyforge.org/projects/scrapi/
186
+ >> uformats
187
+ => http://rubyforge.org/projects/uformats
data/lib/microformat.rb ADDED
@@ -0,0 +1,199 @@
1
+ %w[rubygems hpricot microformat/string open-uri ostruct timeout].each { |f| require f }
2
+ require_gem 'hpricot', '>= 0.4.59'
3
+
4
+ class Microformat
5
+ module Base
6
+ ##
7
+ # The Gateway
8
+ #
9
+ def find(*args)
10
+ target, @options = args
11
+ @options ||= Hash === target ? target : {}
12
+ [:first, :all].each { |key| target = @options[key] if @options[key] }
13
+
14
+ doc = build_doc(@options[:text] ? @options : target)
15
+
16
+ microformats = find_occurances(doc)
17
+ raise MicroformatNotFound if @options[:strict] && microformats.empty?
18
+ return @options[:first] ? nil : [] if microformats.empty?
19
+
20
+ if @options[:first] || @options[:all]
21
+ return @options[:first] ? find_first(microformats) : find_every(microformats)
22
+ end
23
+
24
+ object = find_every(microformats)
25
+ case object.size
26
+ when 1 then object.first
27
+ when 0 then nil
28
+ else object
29
+ end
30
+ end
31
+
32
+ protected
33
+ ##
34
+ # DSL Related
35
+ #
36
+ def inherited(klass)
37
+ klass.instance_variable_set("@container", klass.name.downcase)
38
+ klass.instance_variable_set("@attributes", Hash.new([]))
39
+ end
40
+
41
+ def collector
42
+ collector = Hash.new([])
43
+ def collector.method_missing(method, *classes)
44
+ super unless %w[one many].include? method.to_s
45
+ self[method] += Microformat.send(:break_out_hashes, classes)
46
+ end
47
+ collector
48
+ end
49
+
50
+ def container(container)
51
+ @container = container.to_s
52
+ end
53
+
54
+ def method_missing(method, *args, &block)
55
+ super unless %w[one many].include? method.to_s
56
+ (collected = collector).instance_eval(&block) if block_given?
57
+ classes = block_given? ? [args.first => collected] : break_out_hashes(args)
58
+ @attributes[method] += classes
59
+ end
60
+
61
+ def break_out_hashes(array)
62
+ array.inject([]) do |memo, element|
63
+ memo + (Hash === element ? [element.map { |k,v| { k => v } }].flatten : [element])
64
+ end
65
+ end
66
+
67
+ def aliases(hash)
68
+ define_method(hash.keys.first) do
69
+ send(hash[hash.keys.first])
70
+ end
71
+ end
72
+
73
+ ##
74
+ # The Functionality
75
+ #
76
+ def find_first(doc)
77
+ build_class(doc.first)
78
+ end
79
+
80
+ def find_every(doc)
81
+ doc.inject([]) do |array, entry|
82
+ array + [build_class(entry)]
83
+ end
84
+ end
85
+
86
+ def build_doc(source)
87
+ case source
88
+ when String, File, StringIO
89
+ result = ''
90
+ Timeout::timeout(3) { result = open(source) }
91
+ Hpricot(result)
92
+ when Hpricot, Hpricot::Elements
93
+ source
94
+ when Hash
95
+ Hpricot(source[:text]) if source[:text]
96
+ end
97
+ end
98
+
99
+ def find_occurances(doc)
100
+ doc/".#{@container}"
101
+ end
102
+
103
+ def build_class(microformat)
104
+ hash = build_hash(microformat)
105
+ class_eval { attr_reader *(hash.keys << :properties) }
106
+ klass = new
107
+ klass.instance_variable_set("@properties", hash.keys.map { |i| i.to_s } )
108
+ hash.each do |key, value|
109
+ klass.instance_variable_set("@#{key}", prepare_value(value) )
110
+ end
111
+ klass
112
+ end
113
+
114
+ def build_hash(doc, attributes = @attributes)
115
+ hash = {}
116
+
117
+ [:one, :many].each do |name|
118
+ attributes[name].each do |attribute|
119
+ is_hash = Hash === attribute
120
+ key = is_hash ? attribute.keys.first : attribute
121
+
122
+ # rel="bookmark" pattern
123
+ if bookmark = extract_bookmark(doc)
124
+ hash[:bookmark] = bookmark
125
+ end
126
+
127
+ found = doc/".#{key.no_bang.to_s.gsub('_','-')}"
128
+ raise InvalidMicroformat if found.empty? && key.to_s =~ /!/
129
+ next if found.empty?
130
+
131
+ if is_hash && Hash === attribute[key]
132
+ built_hash = build_hash(found, attribute[key])
133
+ key = key.no_bang
134
+ if built_hash.size.zero? && found.size.nonzero?
135
+ hash[key] = found.map { |f| parse_element(f) }
136
+ hash[key] = hash[key].first if name == :one
137
+ else
138
+ hash[key] = built_hash
139
+ end
140
+ else
141
+ target = is_hash ? attribute[key] : nil
142
+ key = key.no_bang
143
+ if name == :many
144
+ hash[key] ||= []
145
+ hash[key] += found.map { |f| parse_element(f, target) }
146
+ else
147
+ hash[key] = parse_element(found.first, target)
148
+ end
149
+ end
150
+ hash[key] = hash[key].first if Array === hash[key] && hash[key].size == 1
151
+ end
152
+ end
153
+
154
+ hash
155
+ end
156
+
157
+ def extract_bookmark(doc)
158
+ bookmark = doc.search("[@rel='bookmark']").first rescue nil
159
+ bookmark.attributes['href'] if bookmark.respond_to? :attributes
160
+ end
161
+
162
+ def parse_element(element, target = nil)
163
+ if target == :url
164
+ case element.name
165
+ when 'img' then element['src']
166
+ when 'a' then element['href']
167
+ when 'object' then element['value']
168
+ end
169
+ elsif Class === target
170
+ target.find(:first => element)
171
+ else
172
+ case element.name
173
+ when 'abbr' then element['title']
174
+ when 'img' then element['alt']
175
+ else element.innerHTML
176
+ end.strip.strip_html.coerce
177
+ end
178
+ end
179
+
180
+ def prepare_value(value)
181
+ Hash === value ? OpenStruct.new(value) : value
182
+ end
183
+ end
184
+
185
+ def method_missing(method, *args, &block)
186
+ return super(method, *args, &block) unless method == :properties || @properties.include?(method.to_s)
187
+ self.class.class_eval { define_method(method) { instance_variable_get("@#{method}") } }
188
+ instance_variable_get("@#{method}")
189
+ end
190
+
191
+ extend Base
192
+ end
193
+
194
+ class InvalidMicroformat < StandardError; end
195
+ class MicroformatNotFound < StandardError; end
196
+
197
+ # type & id are used a lot in uformats and deprecated in ruby. no loss.
198
+ OpenStruct.class_eval { undef :type, :id }
199
+ Symbol.class_eval { def no_bang() to_s.sub('!','').to_sym end }
data/lib/microformat/simple.rb ADDED
@@ -0,0 +1,28 @@
1
+ require 'microformat'
2
+
3
+ class Microformat::Simple < String
4
+ extend Microformat::Base
5
+
6
+ class << self
7
+ def find_first(doc)
8
+ find_every(doc).first
9
+ end
10
+
11
+ def build_class(tags)
12
+ @from.inject([]) do |array, (key, value)|
13
+ tags.each_child do |tag|
14
+ next unless tag.respond_to? :attributes
15
+ array << new(tag.innerHTML) if tag.attributes[key.to_s] == value.to_s
16
+ end
17
+ array
18
+ end
19
+ end
20
+
21
+ def from(options)
22
+ @from ||= {}
23
+ options.each do |tag, value|
24
+ @from[tag] = value
25
+ end
26
+ end
27
+ end
28
+ end
data/lib/microformat/string.rb ADDED
@@ -0,0 +1,40 @@
1
+ require 'generator'
2
+ require 'date'
3
+ require 'time'
4
+
5
+ # http://project.ioni.st/post/925#post-925
6
+ class String
7
+ def coerce
8
+ attempt = nil
9
+ while coercions.next?
10
+ attempt = coercions.next
11
+ break if !attempt.nil?
12
+ end
13
+ %w[@coercions @generator].each { |i| remove_instance_variable i }
14
+ attempt.nil? ? self : attempt
15
+ end
16
+
17
+ def strip_html
18
+ gsub(/<(?:[^>'"]*|(['"]).*?\1)*>/,'')
19
+ end
20
+
21
+ private
22
+ def coercions
23
+ @coercions ||= Generator.new do |@generator|
24
+ try { self == 'true' }
25
+ try { [self == 'false', false] }
26
+ try { [Date.parse(self), Time.parse(self)] }
27
+ try { Integer(self) }
28
+ try { Float(self) }
29
+ end
30
+ end
31
+
32
+ def try
33
+ attempt, desired = yield
34
+ if attempt
35
+ @generator.yield(desired.nil? ? attempt : desired)
36
+ end
37
+ rescue ArgumentError
38
+ @generator.yield nil
39
+ end
40
+ end
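The coercion chain above depends on the generator library and on block-parameter syntax from the Ruby of its day. For readers on a current Ruby (2.6 or later), here is a minimal sketch of the same idea: try a series of conversions and fall back to the original string. It is not a drop-in replacement. coerce_text is a hypothetical name, and as a labeled assumption it tries numbers before timestamps, which differs from the order in the original.

```ruby
require 'time'

# Hypothetical stand-in for String#coerce.
# Assumption: integers and floats are tried before timestamps,
# unlike the original, which tries dates first.
def coerce_text(text)
  return true  if text == 'true'
  return false if text == 'false'

  integer = Integer(text, 10, exception: false)
  return integer unless integer.nil?

  float = Float(text, exception: false)
  return float unless float.nil?

  begin
    Time.parse(text)
  rescue ArgumentError
    text # nothing matched, so keep the original string
  end
end

p coerce_text('true')
p coerce_text('42')
p coerce_text('2006-06-04T10:32:10Z')
p coerce_text('gruber')
```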
data/lib/mofo.rb ADDED
@@ -0,0 +1,3 @@
1
+ $:.unshift File.join(File.dirname(__FILE__), "lib"), File.join(File.dirname(__FILE__))
2
+
3
+ %w[hentry hreview hcalendar rel_tag hcard xoxo].each { |format| require "mofo/#{format}" }
data/lib/mofo/hcalendar.rb ADDED
@@ -0,0 +1,12 @@
1
+ # => http://microformats.org/wiki/hcalendar
2
+ require 'microformat'
3
+
4
+ class HCalendar < Microformat
5
+ container :vevent
6
+
7
+ one :class, :description, :dtend, :dtstamp, :dtstart,
8
+ :duration, :location, :status, :summary, :uid,
9
+ :last_modified, :url => :url
10
+
11
+ many :category
12
+ end
data/lib/mofo/hcard.rb ADDED
@@ -0,0 +1,40 @@
1
+ # => http://microformats.org/wiki/hcard
2
+ require 'microformat'
3
+
4
+ class HCard < Microformat
5
+ container :vcard
6
+
7
+ one :fn, :bday, :tz, :sort_string, :uid, :class
8
+ many :label, :sound, :title, :role, :key,
9
+ :mailer, :rev, :nickname, :category, :note,
10
+ :logo => :url, :url => :url, :photo => :url
11
+
12
+ one :n do
13
+ one :family_name, :given_name, :additional_name
14
+ many :honorific_prefix, :honorific_suffix
15
+ end
16
+
17
+ many :email do
18
+ many :type
19
+ many :value
20
+ end
21
+
22
+ many :tel do
23
+ many :type
24
+ many :value
25
+ end
26
+
27
+ many :adr do
28
+ one :post_office_box, :extended_address, :street_address,
29
+ :locality, :region, :postal_code, :country_name, :value
30
+ many :type
31
+ end
32
+
33
+ one :geo do
34
+ one :latitude, :longitude
35
+ end
36
+
37
+ many :org do
38
+ one :organization_name, :organization_unit
39
+ end
40
+ end
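The one and many declarations above are class-level calls that record property names for the parser to look up later. Here is a stripped-down sketch of that pattern. TinyFormat and TinyCard are hypothetical names, and the sketch defines one and many as ordinary class methods rather than routing them through method_missing as the library does. Nested blocks and hash arguments such as :url => :url are left out.

```ruby
# Hypothetical miniature of a declarative one/many DSL.
class TinyFormat
  class << self
    attr_reader :attributes

    # Give every subclass its own, unshared attribute lists.
    def inherited(klass)
      super
      klass.instance_variable_set(:@attributes, { one: [], many: [] })
    end

    # Properties expected at most once.
    def one(*names)
      @attributes[:one] += names
    end

    # Properties that may repeat.
    def many(*names)
      @attributes[:many] += names
    end
  end
end

class TinyCard < TinyFormat
  one  :fn, :bday
  many :nickname, :category
end

p TinyCard.attributes
```

Setting up a fresh hash in inherited keeps each subclass's lists separate, so declaring a property on one format never leaks into another.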
data/lib/mofo/hentry.rb ADDED
@@ -0,0 +1,11 @@
1
+ # => http://microformats.org/wiki/hatom
2
+ require 'microformat'
3
+ require 'mofo/hcard'
4
+ require 'mofo/rel_tag'
5
+
6
+ class HEntry < Microformat
7
+ one :entry_title!, :entry_summary, :updated, :published,
8
+ :author => HCard
9
+
10
+ many :entry_content, :tags => RelTag
11
+ end
data/lib/mofo/hfeed.rb ADDED
@@ -0,0 +1,6 @@
1
+ # => http://microformats.org/wiki/hatom
2
+ require 'mofo/hentry'
3
+
4
+ class HFeed < Microformat
5
+ many :hentry => HEntry
6
+ end
data/lib/mofo/hreview.rb ADDED
@@ -0,0 +1,16 @@
1
+ # => http://microformats.org/wiki/hreview
2
+ require 'microformat'
3
+ require 'mofo/hcard'
4
+ require 'mofo/rel_tag'
5
+
6
+ class HReview < Microformat
7
+ one :version, :summary, :type, :dtreviewed, :rating, :description
8
+
9
+ one :reviewer => HCard
10
+
11
+ one :item! do
12
+ one :fn
13
+ end
14
+
15
+ many :tags => RelTag
16
+ end
data/lib/mofo/rel_tag.rb ADDED
@@ -0,0 +1,7 @@
1
+ # => http://microformats.org/wiki/rel-design-pattern
2
+ require 'microformat/simple'
3
+
4
+ class RelTag < Microformat::Simple
5
+ container :tags
6
+ from :rel => :tag
7
+ end
data/lib/mofo/xoxo.rb ADDED
@@ -0,0 +1,56 @@
1
+ $:.unshift 'lib'
2
+ require 'microformat'
3
+
4
+ class XOXO < Microformat
5
+ class << self
6
+ PARENTS = %w[ol ul]
7
+ CHILDREN = %w[li]
8
+
9
+ xpath_build = proc { |element| element.map { |e| "/#{e}" } * ' | ' }
10
+ CHILDREN_XPATH = xpath_build.call(CHILDREN)
11
+ PARENTS_XPATH = xpath_build.call(PARENTS)
12
+
13
+ def find_first(doc)
14
+ find_every(doc).first
15
+ end
16
+
17
+ def find_every(doc)
18
+ tree = []
19
+ doc.each do |child|
20
+ tree << build_tree(child)
21
+ end
22
+ tree
23
+ end
24
+
25
+ def find_occurances(doc)
26
+ @options[:class] ? doc/".xoxo" : doc.search(PARENTS_XPATH)
27
+ end
28
+
29
+ def build_tree(child)
30
+ tree = []
31
+ child.search(CHILDREN_XPATH) do |element|
32
+ label, branch = nil, nil
33
+ element.children.each do |inner|
34
+ label ||= build_label(inner) unless inner.elem? && PARENTS.include?(inner.name)
35
+ branch ||= build_tree(inner) if inner.elem? && PARENTS.include?(inner.name)
36
+ end
37
+ tree << (branch ? { label => branch } : label)
38
+ end
39
+ tree
40
+ end
41
+
42
+ def build_label(node)
43
+ if node.elem?
44
+ label = Label.new(node.innerHTML.strip)
45
+ label.url = node['href'] if node.name == 'a'
46
+ label
47
+ elsif node.text? && !node.to_s.strip.empty?
48
+ node.to_s.strip
49
+ end
50
+ end
51
+ end
52
+
53
+ class Label < String
54
+ attr_accessor :url
55
+ end
56
+ end
metadata ADDED
@@ -0,0 +1,61 @@
1
+ --- !ruby/object:Gem::Specification
2
+ rubygems_version: 0.9.0
3
+ specification_version: 1
4
+ name: mofo
5
+ version: !ruby/object:Gem::Version
6
+ version: "0.1"
7
+ date: 2006-11-09 00:00:00 -08:00
8
+ summary: mofo is a ruby microformat parser
9
+ require_paths:
10
+ - lib
11
+ email: chris[at]ozmm[dot]org
12
+ homepage: http://mofo.rubyforge.org/
13
+ rubyforge_project:
14
+ description: mofo is a ruby microformat parser
15
+ autorequire: mofo
16
+ default_executable:
17
+ bindir: bin
18
+ has_rdoc: false
19
+ required_ruby_version: !ruby/object:Gem::Version::Requirement
20
+ requirements:
21
+ - - ">"
22
+ - !ruby/object:Gem::Version
23
+ version: 0.0.0
24
+ version:
25
+ platform: ruby
26
+ signing_key:
27
+ cert_chain:
28
+ post_install_message:
29
+ authors:
30
+ - Chris Wanstrath
31
+ files:
32
+ - README
33
+ - CHANGELOG
34
+ - LICENSE
35
+ - lib/microformat
36
+ - lib/microformat.rb
37
+ - lib/mofo
38
+ - lib/mofo.rb
39
+ - lib/microformat/simple.rb
40
+ - lib/microformat/string.rb
41
+ - lib/mofo/hcalendar.rb
42
+ - lib/mofo/hcard.rb
43
+ - lib/mofo/hentry.rb
44
+ - lib/mofo/hfeed.rb
45
+ - lib/mofo/hreview.rb
46
+ - lib/mofo/rel_tag.rb
47
+ - lib/mofo/xoxo.rb
48
+ test_files: []
49
+
50
+ rdoc_options: []
51
+
52
+ extra_rdoc_files: []
53
+
54
+ executables: []
55
+
56
+ extensions: []
57
+
58
+ requirements: []
59
+
60
+ dependencies: []
61
+