mofo 0.1

data/CHANGELOG ADDED
@@ -0,0 +1,2 @@
1
+ = 0.1
2
+ - First release.
data/LICENSE ADDED
@@ -0,0 +1,18 @@
1
+ Copyright (c) 2006 Chris Wanstrath
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of
4
+ this software and associated documentation files (the "Software"), to deal in
5
+ the Software without restriction, including without limitation the rights to
6
+ use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
7
+ the Software, and to permit persons to whom the Software is furnished to do so,
8
+ subject to the following conditions:
9
+
10
+ The above copyright notice and this permission notice shall be included in all
11
+ copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
14
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
15
+ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
16
+ COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
17
+ IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
18
+ CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README ADDED
@@ -0,0 +1,187 @@
1
+ mofo.
2
+ - a ruby microformat parser -
3
+ engine
4
+ dsl
5
+ helper
6
+ toy
7
+
8
+ = Get Started Immediately
9
+
10
+ $ irb -rubygems
11
+ >> require 'mofo'
12
+ => true
13
+
14
+ >> fireball = HCard.find 'http://flickr.com/people/gruber/'
15
+ => #<HCard:0x6db898 ... >
16
+
17
+ >> fireball.nickname
18
+ => "gruber"
19
+
20
+ >> fireball.url
21
+ => "http://daringfireball.net/"
22
+
23
+ >> fireball.n.family_name
24
+ => "Gruber"
25
+
26
+ >> fireball.title
27
+ => "Raconteur"
28
+
29
+ >> fireball.adr.locality
30
+ => "Philadelphia"
31
+
32
+ >> fireball.logo
33
+ => "http://static.flickr.com/9/buddyicons/44621776@N00.jpg?1117572751"
34
+
35
+ = Microwhozit?
36
+
37
+ Microformats are tiny little markup definitions built on top of, usually,
38
+ HTML or XHTML.
39
+
40
+ You have a blog. You have recent posts on your blog's index page. You have
41
+ an Atom feed. You have recent posts on your blog's Atom feed. See where I'm
42
+ going with this?
43
+
44
+ The hAtom microformat (or uformat) can be embedded in your existing HTML by
45
+ setting CSS classes with semantic meaning inside of your posts. A class to signify
46
+ a post is contained within this div, a class to signify the contents of this
47
+ h3 are the post's title, a class to signify the contents of this span is the
48
+ blog post's author, etc.
49
+
50
+ You can then use a microformat parser (like, say, mofo) to extract this information
51
+ as you would from an Atom feed. Hell, you can even convert hAtom to Atom. It's an
52
+ insta-feed! No extra code required!
53
+
54
+ You're already doing the work, you see. Microformats are everywhere. We just need
55
+ to set them free.
56
+
57
+ Check it:
58
+
59
+ <div class="post">
60
+ <h3>Megadeth Show Last Night</h3>
61
+ <span class="subtitle">Posted by Chris on June 4th</span>
62
+ <div class="content">Went to a show last night. Megadeth. It was alright.</div>
63
+ </div>
64
+
65
+ Right? Normal. Here's the same post marked up with hAtom:
66
+
67
+ <div class="post hentry">
68
+ <h3 class="entry-title">Megadeth Show Last Night</h3>
69
+ <span class="subtitle">Posted by <span class="author vcard fn">Chris</span> on
70
+ <abbr class="updated" title="2006-06-04T10:32:10Z">June 4th</abbr></span>
71
+ <div class="content entry-content">Went to a show last night. Megadeth. It was alright.</div>
72
+ </div>
73
+
74
+ All I did was add the hentry, entry-title, and entry-content classes to existing containers. Then I
75
+ went ahead and wrapped the date in an <abbr> tag giving it a title in the microformat-standard way. Finally
76
+ I put a span around Chris, marking it as the author field of the hEntry and making it a valid hCard by
77
+ including the vcard and fn classes. It's really not all that hard. Did I mess it up? Maybe, but I'm sure I got
78
+ close. And I didn't even use a reference. Practice.
79
+
80
+ How'd we parse this, tho?
81
+
82
+ $ irb -rubygems
83
+ >> require 'mofo'
84
+ => true
85
+
86
+ >> post = HEntry.find 'http://milesofstyle.org/posts/351-megadeth-show-last-night.html'
87
+ => #<HEntry:0x6db898 ... >
88
+
89
+ >> post.entry_title
90
+ => "Megadeth Show Last Night"
91
+
92
+ >> post.properties
93
+ => ["entry_content", "updated", "author", "entry_title"]
94
+
95
+ >> post.updated
96
+ => Sun Jun 04 10:32:10 UTC 2006
97
+
98
+ >> post.updated.class
99
+ => Time
100
+
101
+ >> post.author
102
+ => #<HCard:0x6e7b98 @properties=["fn"], @fn="Chris">
103
+
104
+ >> post.author.fn
105
+ => "Chris"
106
+
107
+ >> post.entry_content
108
+ => "Went to a show last night. Megadeth. It was alright."
109
+
110
+ That's, like, stupid easy. If HEntry.find gets back more than one hEntry, you'll get an array.
111
+
112
+ = Mofo#find
113
+
114
+ Everything revolves around the #find method. Sound familiar? Yeah.
115
+
116
+ >> Microformat.find "http://valid-url.com"
117
+ >> Microformat.find "/path/to/existing/file"
118
+ >> Microformat.find :text => "microformat text"
119
+
120
+ Also, #find can be told explicitly to find all (returning an empty array on failure) or only find
121
+ the first (returning nil on failure).
122
+
123
+ >> Microformat.find :all => "/existing/file"
124
+ => [ array of microformat objects ]
125
+
126
+ >> Microformat.find :first => "/existing/file"
127
+ => microformat object
128
+
129
+ >> Microformat.find "/existing/file"
130
+ => either an array of objects or just one object
131
+
132
+ :all and :first go outside of :text.
133
+
134
+ >> Microformat.find :all => { :text => 'mfin text' }
135
+
136
+ That's it. Some microformats take specific options.
137
+
138
+ = Microformats
139
+
140
+ Here are the currently implemented microformats, along with a site you
141
+ can use them on today. We want more, better, faster, stat.
142
+
143
+ formats:
144
+ - hCard [ flickr profiles ]
145
+ - hCalendar [ upcoming.org ]
146
+ - hReview [ cork'd reviews ]
147
+ - hEntry [ err the blog posts ]
148
+ - xoxo [ chowhound.com ]
149
+
150
+ patterns:
151
+ - rel-tag
152
+ - rel-bookmark
153
+
154
+ Here are the microformats we want soon soon soon:
155
+ - geo
156
+ - hResume
157
+
158
+ patterns:
159
+ - include-pattern
160
+
161
+ = Ruby on Rails
162
+
163
+ mofo doubles as a Rails plugin. Just drop it into vendor/plugins and you are good to go, with all the
164
+ available microformat parsers loaded into your application.
165
+
166
+ mofo classes are YAML and Marshal approved. This means you can cache them with DRb or memcached, or store
167
+ them in a session.
168
+
169
+ = More Info
170
+
171
+ >> http://microformats.org/
172
+ => "The homepage, check"
173
+ >> http://microformats.org/wiki/
174
+ => "The wiki, check"
175
+ >> http://blog.labnotes.org/category/microformats/
176
+ => "Assaf Arkin knows his MFin' stuff"
177
+ >> http://allinthehead.com/
178
+ => "Drew McClellan, Microformat wizard"
179
+ >> http://mofo.rubyforge.org/
180
+ => "mofo HQ"
181
+
182
+ = Other Parsers
183
+
184
+ >> Scrapi
185
+ => http://rubyforge.org/projects/scrapi/
186
+ >> uformats
187
+ => http://rubyforge.org/projects/uformats
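The return rules the README gives for #find (:first yields one object or nil, :all always yields an array, a bare call yields one object or an array) can be sketched in isolation. This is a minimal illustration rather than mofo's own code: `shape_result` and its `mode` argument are hypothetical names, and `found` stands in for the objects already parsed out of a document.

```ruby
# Hypothetical helper: mimics how the result of a find call is shaped.
# `found` is an array of already-built objects; `mode` is :first, :all, or nil.
def shape_result(found, mode = nil)
  case mode
  when :first then found.first   # nil when nothing was found
  when :all   then found         # always an array, possibly empty
  else found.size == 1 ? found.first : found
  end
end

one_card  = shape_result(%w[gruber])        # a single match is unwrapped
many_card = shape_result(%w[gruber chris])  # several matches stay an array
```

The bare call is the convenient form for irb sessions; callers that need a predictable return type are probably better served by :first or :all.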
data/lib/microformat.rb ADDED
@@ -0,0 +1,199 @@
1
+ %w[rubygems hpricot microformat/string open-uri ostruct timeout].each { |f| require f }
2
+ require_gem 'hpricot', '>= 0.4.59'
3
+
4
+ class Microformat
5
+ module Base
6
+ ##
7
+ # The Gateway
8
+ #
9
+ def find(*args)
10
+ target, @options = args
11
+ @options ||= Hash === target ? target : {}
12
+ [:first, :all].each { |key| target = @options[key] if @options[key] }
13
+
14
+ doc = build_doc(@options[:text] ? @options : target)
15
+
16
+ microformats = find_occurances(doc)
17
+ raise MicroformatNotFound if @options[:strict] && microformats.empty?
18
+ return @options[:first] ? nil : [] if microformats.empty?
19
+
20
+ if @options[:first] || @options[:all]
21
+ return @options[:first] ? find_first(microformats) : find_every(microformats)
22
+ end
23
+
24
+ object = find_every(microformats)
25
+ case object.size
26
+ when 1 then object.first
27
+ when 0 then nil
28
+ else object
29
+ end
30
+ end
31
+
32
+ protected
33
+ ##
34
+ # DSL Related
35
+ #
36
+ def inherited(klass)
37
+ klass.instance_variable_set("@container", klass.name.downcase)
38
+ klass.instance_variable_set("@attributes", Hash.new([]))
39
+ end
40
+
41
+ def collector
42
+ collector = Hash.new([])
43
+ def collector.method_missing(method, *classes)
44
+ super unless %w[one many].include? method.to_s
45
+ self[method] += Microformat.send(:break_out_hashes, classes)
46
+ end
47
+ collector
48
+ end
49
+
50
+ def container(container)
51
+ @container = container.to_s
52
+ end
53
+
54
+ def method_missing(method, *args, &block)
55
+ super unless %w[one many].include? method.to_s
56
+ (collected = collector).instance_eval(&block) if block_given?
57
+ classes = block_given? ? [args.first => collected] : break_out_hashes(args)
58
+ @attributes[method] += classes
59
+ end
60
+
61
+ def break_out_hashes(array)
62
+ array.inject([]) do |memo, element|
63
+ memo + (Hash === element ? [element.map { |k,v| { k => v } }].flatten : [element])
64
+ end
65
+ end
66
+
67
+ def aliases(hash)
68
+ define_method(hash.keys.first) do
69
+ send(hash[hash.keys.first])
70
+ end
71
+ end
72
+
73
+ ##
74
+ # The Functionality
75
+ #
76
+ def find_first(doc)
77
+ build_class(doc.first)
78
+ end
79
+
80
+ def find_every(doc)
81
+ doc.inject([]) do |array, entry|
82
+ array + [build_class(entry)]
83
+ end
84
+ end
85
+
86
+ def build_doc(source)
87
+ case source
88
+ when String, File, StringIO
89
+ result = ''
90
+ Timeout::timeout(3) { result = open(source) }
91
+ Hpricot(result)
92
+ when Hpricot, Hpricot::Elements
93
+ source
94
+ when Hash
95
+ Hpricot(source[:text]) if source[:text]
96
+ end
97
+ end
98
+
99
+ def find_occurances(doc)
100
+ doc/".#{@container}"
101
+ end
102
+
103
+ def build_class(microformat)
104
+ hash = build_hash(microformat)
105
+ class_eval { attr_reader *(hash.keys << :properties) }
106
+ klass = new
107
+ klass.instance_variable_set("@properties", hash.keys.map { |i| i.to_s } )
108
+ hash.each do |key, value|
109
+ klass.instance_variable_set("@#{key}", prepare_value(value) )
110
+ end
111
+ klass
112
+ end
113
+
114
+ def build_hash(doc, attributes = @attributes)
115
+ hash = {}
116
+
117
+ [:one, :many].each do |name|
118
+ attributes[name].each do |attribute|
119
+ is_hash = Hash === attribute
120
+ key = is_hash ? attribute.keys.first : attribute
121
+
122
+ # rel="bookmark" pattern
123
+ if bookmark = extract_bookmark(doc)
124
+ hash[:bookmark] = bookmark
125
+ end
126
+
127
+ found = doc/".#{key.no_bang.to_s.gsub('_','-')}"
128
+ raise InvalidMicroformat if found.empty? && key.to_s =~ /!/
129
+ next if found.empty?
130
+
131
+ if is_hash && Hash === attribute[key]
132
+ built_hash = build_hash(found, attribute[key])
133
+ key = key.no_bang
134
+ if built_hash.size.zero? && found.size.nonzero?
135
+ hash[key] = found.map { |f| parse_element(f) }
136
+ hash[key] = hash[key].first if name == :one
137
+ else
138
+ hash[key] = built_hash
139
+ end
140
+ else
141
+ target = is_hash ? attribute[key] : nil
142
+ key = key.no_bang
143
+ if name == :many
144
+ hash[key] ||= []
145
+ hash[key] += found.map { |f| parse_element(f, target) }
146
+ else
147
+ hash[key] = parse_element(found.first, target)
148
+ end
149
+ end
150
+ hash[key] = hash[key].first if Array === hash[key] && hash[key].size == 1
151
+ end
152
+ end
153
+
154
+ hash
155
+ end
156
+
157
+ def extract_bookmark(doc)
158
+ bookmark = doc.search("[@rel='bookmark']").first rescue nil
159
+ bookmark.attributes['href'] if bookmark.respond_to? :attributes
160
+ end
161
+
162
+ def parse_element(element, target = nil)
163
+ if target == :url
164
+ case element.name
165
+ when 'img' then element['src']
166
+ when 'a' then element['href']
167
+ when 'object' then element['value']
168
+ end
169
+ elsif Class === target
170
+ target.find(:first => element)
171
+ else
172
+ case element.name
173
+ when 'abbr' then element['title']
174
+ when 'img' then element['alt']
175
+ else element.innerHTML
176
+ end.strip.strip_html.coerce
177
+ end
178
+ end
179
+
180
+ def prepare_value(value)
181
+ Hash === value ? OpenStruct.new(value) : value
182
+ end
183
+ end
184
+
185
+ def method_missing(method, *args, &block)
186
+ return super(method, *args, &block) unless method == :properties || @properties.include?(method.to_s)
187
+ self.class.class_eval { define_method(method) { instance_variable_get("@#{method}") } }
188
+ instance_variable_get("@#{method}")
189
+ end
190
+
191
+ extend Base
192
+ end
193
+
194
+ class InvalidMicroformat < Exception; end
195
+ class MicroformatNotFound < Exception; end
196
+
197
+ # type & id are used a lot in uformats and deprecated in ruby. no loss.
198
+ OpenStruct.class_eval { undef :type, :id }
199
+ Symbol.class_eval { def no_bang() to_s.sub('!','').to_sym end }
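The value-extraction rules in parse_element above (a :url target reads src, href, or value; otherwise abbr gives its title, img its alt, and anything else its inner HTML) can be tried without Hpricot. The sketch below is an assumption-laden stand-in: `Element` is a hypothetical Struct, not an Hpricot class, `extract_value` is an illustrative name, and the HTML stripping, coercion, and nested-microformat branch of the original are left out.

```ruby
# Hypothetical stand-in for a parsed element: tag name, attribute hash, inner HTML.
Element = Struct.new(:name, :attrs, :inner_html)

# Picks the value of an element following the rules described above.
def extract_value(element, target = nil)
  if target == :url
    # URL-valued properties live in a different attribute per tag.
    case element.name
    when 'img'    then element.attrs['src']
    when 'a'      then element.attrs['href']
    when 'object' then element.attrs['value']
    end
  else
    raw = case element.name
          when 'abbr' then element.attrs['title'] # machine-readable form
          when 'img'  then element.attrs['alt']
          else element.inner_html
          end
    raw.to_s.strip
  end
end

updated = Element.new('abbr', { 'title' => '2006-06-04T10:32:10Z' }, 'June 4th')
stamp = extract_value(updated)
```

Reading the title of an abbr is what lets the README's hAtom example show "June 4th" to people while handing the parser a full timestamp.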
data/lib/microformat/simple.rb ADDED
@@ -0,0 +1,28 @@
1
+ require 'microformat'
2
+
3
+ class Microformat::Simple < String
4
+ extend Microformat::Base
5
+
6
+ class << self
7
+ def find_first(doc)
8
+ find_every(doc).first
9
+ end
10
+
11
+ def build_class(tags)
12
+ @from.inject([]) do |array, (key, value)|
13
+ tags.each_child do |tag|
14
+ next unless tag.respond_to? :attributes
15
+ array << new(tag.innerHTML) if tag.attributes[key.to_s] == value.to_s
16
+ end
17
+ array
18
+ end
19
+ end
20
+
21
+ def from(options)
22
+ @from ||= {}
23
+ options.each do |tag, value|
24
+ @from[tag] = value
25
+ end
26
+ end
27
+ end
28
+ end
data/lib/microformat/string.rb ADDED
@@ -0,0 +1,40 @@
1
+ require 'generator'
2
+ require 'date'
3
+ require 'time'
4
+
5
+ # http://project.ioni.st/post/925#post-925
6
+ class String
7
+ def coerce
8
+ attempt = nil
9
+ while coercions.next?
10
+ attempt = coercions.next
11
+ break if !attempt.nil?
12
+ end
13
+ %w[@coercions @generator].each { |i| remove_instance_variable i }
14
+ attempt.nil? ? self : attempt
15
+ end
16
+
17
+ def strip_html
18
+ gsub(/<(?:[^>'"]*|(['"]).*?\1)*>/,'')
19
+ end
20
+
21
+ private
22
+ def coercions
23
+ @coercions ||= Generator.new do |@generator|
24
+ try { self == 'true' }
25
+ try { [self == 'false', false] }
26
+ try { [Date.parse(self), Time.parse(self)] }
27
+ try { Integer(self) }
28
+ try { Float(self) }
29
+ end
30
+ end
31
+
32
+ def try
33
+ attempt, desired = yield
34
+ if attempt
35
+ @generator.yield(desired.nil? ? attempt : desired)
36
+ end
37
+ rescue ArgumentError
38
+ @generator.yield nil
39
+ end
40
+ end
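String#coerce above walks a list of conversions and keeps the first one that succeeds, falling back to the original string. The same idea can be sketched without the Generator library, which current Ruby versions no longer ship. Note the assumptions: `coerce_value` is a hypothetical name, the numeric checks run before the time check (the original tries dates first), and strict `Time.iso8601` is swapped in for the looser `Date.parse` and `Time.parse` pair.

```ruby
require 'time'

# Hypothetical helper: returns the first successful conversion of `str`,
# or `str` itself when no conversion applies.
def coerce_value(str)
  return true  if str == 'true'
  return false if str == 'false'

  conversions = [
    ->(s) { Integer(s, 10) },   # raises ArgumentError on non-integers
    ->(s) { Float(s) },         # raises ArgumentError on non-numbers
    ->(s) { Time.iso8601(s) }   # raises ArgumentError on non-ISO 8601 input
  ]

  conversions.each do |attempt|
    begin
      return attempt.call(str)
    rescue ArgumentError
      next # this conversion did not apply; try the next one
    end
  end

  str
end

updated = coerce_value('2006-06-04T10:32:10Z')
```

Strict ISO 8601 parsing is a deliberate narrowing for this sketch: it keeps short numeric strings from being read as dates, at the cost of rejecting free-form dates that the original accepts.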
data/lib/mofo.rb ADDED
@@ -0,0 +1,3 @@
1
+ $:.unshift File.join(File.dirname(__FILE__), "lib"), File.join(File.dirname(__FILE__))
2
+
3
+ %w[hentry hreview hcalendar rel_tag hcard xoxo].each { |format| require "mofo/#{format}" }
data/lib/mofo/hcalendar.rb ADDED
@@ -0,0 +1,12 @@
1
+ # => http://microformats.org/wiki/hcalendar
2
+ require 'microformat'
3
+
4
+ class HCalendar < Microformat
5
+ container :vevent
6
+
7
+ one :class, :description, :dtend, :dtstamp, :dtstart,
8
+ :duration, :location, :status, :summary, :uid,
9
+ :last_modified, :url => :url
10
+
11
+ many :category
12
+ end
data/lib/mofo/hcard.rb ADDED
@@ -0,0 +1,40 @@
1
+ # => http://microformats.org/wiki/hcard
2
+ require 'microformat'
3
+
4
+ class HCard < Microformat
5
+ container :vcard
6
+
7
+ one :fn, :bday, :tz, :sort_string, :uid, :class
8
+ many :label, :sound, :title, :role, :key,
9
+ :mailer, :rev, :nickname, :category, :note,
10
+ :logo => :url, :url => :url, :photo => :url
11
+
12
+ one :n do
13
+ one :family_name, :given_name, :additional_name
14
+ many :honorific_prefix, :honorific_suffix
15
+ end
16
+
17
+ many :email do
18
+ many :type
19
+ many :value
20
+ end
21
+
22
+ many :tel do
23
+ many :type
24
+ many :value
25
+ end
26
+
27
+ many :adr do
28
+ one :post_office_box, :extended_address, :street_address,
29
+ :locality, :region, :postal_code, :country_name, :value
30
+ many :type
31
+ end
32
+
33
+ one :geo do
34
+ one :latitude, :longitude
35
+ end
36
+
37
+ many :org do
38
+ one :organization_name, :organization_unit
39
+ end
40
+ end
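The `one` and `many` declarations in HCard above are class-level calls that record property names for the parser to look up later. A stripped-down version of that pattern, assuming explicit class methods instead of the method_missing dispatch mofo uses and ignoring nested blocks such as `one :n do ... end`, might look like this. `TinyFormat` and `TinyCard` are hypothetical names.

```ruby
# Hypothetical base class: collects declared property names per subclass.
class TinyFormat
  class << self
    attr_reader :attributes

    # Give every subclass its own fresh attribute table.
    def inherited(klass)
      super
      klass.instance_variable_set(:@attributes, { one: [], many: [] })
    end

    # Properties expected at most once.
    def one(*names)
      @attributes[:one].concat(names)
    end

    # Properties that may repeat.
    def many(*names)
      @attributes[:many].concat(names)
    end
  end
end

class TinyCard < TinyFormat
  one  :fn, :bday
  many :nickname, :url => :url  # trailing hash marks a URL-valued property
end
```

Keeping the table in a class-level instance variable means each microformat class carries its own declarations, so subclasses do not leak properties into one another.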
data/lib/mofo/hentry.rb ADDED
@@ -0,0 +1,11 @@
1
+ # => http://microformats.org/wiki/hatom
2
+ require 'microformat'
3
+ require 'mofo/hcard'
4
+ require 'mofo/rel_tag'
5
+
6
+ class HEntry < Microformat
7
+ one :entry_title!, :entry_summary, :updated, :published,
8
+ :author => HCard
9
+
10
+ many :entry_content, :tags => RelTag
11
+ end
data/lib/mofo/hfeed.rb ADDED
@@ -0,0 +1,6 @@
1
+ # => http://microformats.org/wiki/hatom
2
+ require 'mofo/hentry'
3
+
4
+ class HFeed < Microformat
5
+ many :hentry => HEntry
6
+ end
data/lib/mofo/hreview.rb ADDED
@@ -0,0 +1,16 @@
1
+ # => http://microformats.org/wiki/hreview
2
+ require 'microformat'
3
+ require 'mofo/hcard'
4
+ require 'mofo/rel_tag'
5
+
6
+ class HReview < Microformat
7
+ one :version, :summary, :type, :dtreviewed, :rating, :description
8
+
9
+ one :reviewer => HCard
10
+
11
+ one :item! do
12
+ one :fn
13
+ end
14
+
15
+ many :tags => RelTag
16
+ end
data/lib/mofo/rel_tag.rb ADDED
@@ -0,0 +1,7 @@
1
+ # => http://microformats.org/wiki/rel-design-pattern
2
+ require 'microformat/simple'
3
+
4
+ class RelTag < Microformat::Simple
5
+ container :tags
6
+ from :rel => :tag
7
+ end
data/lib/mofo/xoxo.rb ADDED
@@ -0,0 +1,56 @@
1
+ $:.unshift 'lib'
2
+ require 'microformat'
3
+
4
+ class XOXO < Microformat
5
+ class << self
6
+ PARENTS = %w[ol ul]
7
+ CHILDREN = %w[li]
8
+
9
+ xpath_build = proc { |element| element.map { |e| "/#{e}" } * ' | ' }
10
+ CHILDREN_XPATH = xpath_build.call(CHILDREN)
11
+ PARENTS_XPATH = xpath_build.call(PARENTS)
12
+
13
+ def find_first(doc)
14
+ find_every(doc).first
15
+ end
16
+
17
+ def find_every(doc)
18
+ tree = []
19
+ doc.each do |child|
20
+ tree << build_tree(child)
21
+ end
22
+ tree
23
+ end
24
+
25
+ def find_occurances(doc)
26
+ @options[:class] ? doc/".xoxo" : doc.search(PARENTS_XPATH)
27
+ end
28
+
29
+ def build_tree(child)
30
+ tree = []
31
+ child.search(CHILDREN_XPATH) do |element|
32
+ label, branch = nil, nil
33
+ element.children.each do |inner|
34
+ label ||= build_label(inner) unless inner.elem? && PARENTS.include?(inner.name)
35
+ branch ||= build_tree(inner) if inner.elem? && PARENTS.include?(inner.name)
36
+ end
37
+ tree << (branch ? { label => branch } : label)
38
+ end
39
+ tree
40
+ end
41
+
42
+ def build_label(node)
43
+ if node.elem?
44
+ label = Label.new(node.innerHTML.strip)
45
+ label.url = node['href'] if node.name == 'a'
46
+ label
47
+ elsif node.text? && !node.to_s.strip.empty?
48
+ node.to_s.strip
49
+ end
50
+ end
51
+ end
52
+
53
+ class Label < String
54
+ attr_accessor :url
55
+ end
56
+ end
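XOXO#build_tree above turns a nested list into a Ruby structure: a plain item becomes its label, and an item that contains a sub-list becomes a one-pair hash of label to branch. The shape of that recursion can be shown on plain arrays. In this sketch the input format (pairs of label and optional children) and the name `outline_to_tree` are assumptions made for illustration; the real method walks Hpricot nodes.

```ruby
# Hypothetical input: each item is [label, children], where children is
# either nil or another list of items in the same format.
def outline_to_tree(items)
  items.map do |label, children|
    # Items with a sub-list become { label => subtree }; leaves stay as labels.
    children ? { label => outline_to_tree(children) } : label
  end
end

outline = [
  ['formats', [['hCard', nil], ['hCalendar', nil]]],
  ['patterns', [['rel-tag', nil]]],
  ['toy', nil]
]
tree = outline_to_tree(outline)
```

Representing a branch as a one-pair hash keeps leaves and branches distinguishable while preserving list order, which a single flat hash would not guarantee in the Ruby versions this gem targeted.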
metadata ADDED
@@ -0,0 +1,61 @@
1
+ --- !ruby/object:Gem::Specification
2
+ rubygems_version: 0.9.0
3
+ specification_version: 1
4
+ name: mofo
5
+ version: !ruby/object:Gem::Version
6
+ version: "0.1"
7
+ date: 2006-11-09 00:00:00 -08:00
8
+ summary: mofo is a ruby microformat parser
9
+ require_paths:
10
+ - lib
11
+ email: chris[at]ozmm[dot]org
12
+ homepage: http://mofo.rubyforge.org/
13
+ rubyforge_project:
14
+ description: mofo is a ruby microformat parser
15
+ autorequire: mofo
16
+ default_executable:
17
+ bindir: bin
18
+ has_rdoc: false
19
+ required_ruby_version: !ruby/object:Gem::Version::Requirement
20
+ requirements:
21
+ - - ">"
22
+ - !ruby/object:Gem::Version
23
+ version: 0.0.0
24
+ version:
25
+ platform: ruby
26
+ signing_key:
27
+ cert_chain:
28
+ post_install_message:
29
+ authors:
30
+ - Chris Wanstrath
31
+ files:
32
+ - README
33
+ - CHANGELOG
34
+ - LICENSE
35
+ - lib/microformat
36
+ - lib/microformat.rb
37
+ - lib/mofo
38
+ - lib/mofo.rb
39
+ - lib/microformat/simple.rb
40
+ - lib/microformat/string.rb
41
+ - lib/mofo/hcalendar.rb
42
+ - lib/mofo/hcard.rb
43
+ - lib/mofo/hentry.rb
44
+ - lib/mofo/hfeed.rb
45
+ - lib/mofo/hreview.rb
46
+ - lib/mofo/rel_tag.rb
47
+ - lib/mofo/xoxo.rb
48
+ test_files: []
49
+
50
+ rdoc_options: []
51
+
52
+ extra_rdoc_files: []
53
+
54
+ executables: []
55
+
56
+ extensions: []
57
+
58
+ requirements: []
59
+
60
+ dependencies: []
61
+