syndication 0.6.2 → 0.6.3

data/README CHANGED
@@ -1,228 +1,228 @@
- # = Syndication 0.6
- #
- # This module provides classes for parsing web syndication feeds in RSS and
- # Atom formats.
- #
- # To parse RSS, use Syndication::RSS::Parser.
- #
- # To parse Atom, use Syndication::Atom::Parser.
- #
- # If you want my advice on which to generate, my order of preference would
- # be:
- #
- # 1. Atom 1.0
- # 2. RSS 1.0
- # 3. RSS 2.0
- #
- # My reasoning is simply that I hate having to sniff for HTML (see
- # Syndication::RSS).
- #
- # == License
- #
- # Syndication is Copyright 2005-2006 mathew <meta@pobox.com>, and is licensed
- # under the same terms as Ruby.
- #
- # == Requirements
- #
- # Built and tested using Ruby 1.8.4. Needs only the standard library.
- #
- # == Rationale
- #
- # Ruby already has an RSS library as part of the standard library, so you
- # might be wondering why I decided to write another one.
- #
- # I started out trying to document the standard rss module, but found the
- # code rather impenetrable. It was also difficult to see how it could be made
- # documentable via Rdoc.
- #
- # Then I tried writing code to use the standard RSS library, and discovered
- # that it had a number of (what I consider to be) defects:
- #
- # - It doesn't support RSS 2.0 with extensions (such as iTunes podcast feeds),
- # and it wasn't clear to me how to extend it to do so.
- #
- # - It doesn't support RSS 0.9.
- #
- # - It doesn't support Atom.
- #
- # - The API is different depending on what kind of RSS feed you are parsing.
- #
- # I asked around, and discovered that I wasn't the only person dissatisfied
- # with the RSS library. Since fixing the problems would have resulted in
- # breaking existing code that used the RSS module, I opted for an all-new
- # implementation.
- #
- # This is the result. The first release was version 0.4, which was actually my
- # fourth attempt at putting together a clean, simple, universal API for RSS
- # and Atom parsing. (The first three never saw public release.)
- #
- # == Features
- #
- # Here are what I see as the key improvements over the rss module in the
- # Ruby standard library:
- #
- # - Supports all RSS versions, including RSS 0.9, as well as Atom.
- #
- # - Provides a unified API/object model for accessing the decoded data,
- # with no need to know what format the feed is in.
- #
- # - Allows use of extended RSS 2.0 feeds.
- #
- # - Simple API, fully documented.
- #
- # - Test suite with over 220 test assertions.
- #
- # - Commented source code.
- #
- # - Less source code than the standard library rss module.
- #
- # - Faster than the standard library (at least, in my tests).
- #
- # Other features:
- #
- # - Optional support for RSS 1.0 Dublin Core, Syndication and Content modules,
- # Apple iTunes Podcast elements, and Google Calendar.
- #
- # - Content module decodes CDATA-escaped or encoded HTML content for you.
- #
- # - Supports namespaces, and encoded XHTML/HTML in Atom feeds.
- #
- # - Dates decoded to Ruby DateTime objects. Note, however, that this is slow,
- # so parsing is only performed if you ask for the value.
- #
- # - Simple to extend to support your own RSS extensions, uses reflection.
- #
- # - Uses REXML fast stream parsing API for speed, or built-in TagSoup parser
- # for invalid feeds.
- #
- # - Non-validating, tries to be as forgiving as possible of structural errors.
- #
- # - Remaps namespace prefixes to standard values if it recognizes the module's
- # URL.
- #
- # In the interests of balance, here are some key disadvantages over the
- # standard library RSS support:
- #
- # - No support for _generating_ RSS feeds, only for parsing them. If
- # you're using Rails, you can use RXML; if not, you can use rss/maker.
- # My feeling is that XML generation isn't a wheel that needs reinventing.
- #
- # - Different API, not a drop-in replacement.
- #
- # - Incomplete support for Atom 0.3 draft. (Anyone still using it?)
- #
- # - No support for base64 data in Atom feeds (yet).
- #
- # - No Japanese documentation.
- #
- # - No XSL output options.
- #
- # - Slower if there are dates in the feed and you ask for their values.
- #
- # == Other options
- #
- # There are, of course, other Ruby RSS/Atom libraries out there. The ones I
- # know about:
- #
- # = simple-rss
- #
- # http://rubyforge.org/projects/simple-rss
- #
- # Pros:
- # - Much smaller than syndication or rss.
- #
- # - Completely non-validating.
- #
- # - Backwards compatible with rss in standard library.
- #
- # Cons:
- # - Doesn't use a real XML parser.
- #
- # - No support for namespaces.
- #
- # - Incomplete Atom support (e.g. can't get name and e-mail of <atom:person>
- # elements as separate fields, you still have to decode XHTML data yourself)
- #
- # - No documentation.
- #
- # For the record, I started work on my library long before simple-rss was
- # announced.
- #
- # = feedtools
- #
- # http://rubyforge.org/projects/feedtools/
- #
- # This one solves most of the same problems as Syndication; however the two
- # were developed in parallel, in ignorance of each other.
- #
- # Feedtools builds in database caching and persistance, and HTTP fetching.
- # Personally, I don't think those belong in a feed parsing library--they
- # are easily implemented using other standard libraries if you want them.
- #
- # Pros:
- # - Lots of test cases.
- #
- # - Used by lots of Rails people.
- #
- # - Knows about many more namespaces.
- #
- # - Can generate feeds.
- #
- # Cons:
- # - Skimpy documentation.
- #
- # - Uses HTree then XPath parsing, rather than a single stream parse.
- #
- # - Tries to unify RSS and Atom APIs, at the expense of Atom functionality.
- # (Which could also be a pro, depending on your viewpoint.)
- #
- # == Design philosophy
- #
- # Here's my design philosophy for this module:
- #
- # - The interface should be via standard Ruby objects and methods; e.g.
- # feed.channel.item[0].title, rather than (say) a dictionary hash.
- #
- # - It should be easier to parse RSS via the module than to hack something
- # together using REXML, even if all you want is a list of titles and URLs.
- #
- # - It should be easy to add support for new RSS extensions without needing
- # to know anything about reflection or other advanced topics. Just define
- # a mixin with a bunch of appropriately-named methods, and you're done.
- #
- # - The code should be simple to understand.
- #
- # - Even so, good complete documentation is extremely important.
- #
- # - Be lenient in what you accept.
- #
- # - Be conservative in what you generate.
- #
- # - Get well-formed feeds parsing reliably, then worry about broken feeds.
- #
- # - Atom will hopefully be the future. Provide full support for RSS, but don't
- # hold Atom back by trying to force it into an RSS data model.
- #
- # == Future plans
- #
- # Here are some possible improvements:
- #
- # - RSS and Atom generation. Create objects, then call Syndication::FeedMaker
- # to generate XML in various flavors. This probably won't happen until an XML
- # generator is picked for the Ruby standard library.
- #
- # - Faster date parsing. It turns out that when I asked for parsed dates in
- # my test code, the profiler showed Date.parse chewing up 25% of the total
- # CPU time used. A more specific ISO8601 specific date parser could cut
- # that down drastically.
- #
- # - Additional Google Data support. I just wanted to be able to display my
- # upcoming calendar dates, but clearly there is a lot more that could be
- # implemented. Unfortunately, recurring events don't seem to have a clean
- # XML representation in Google's data feeds yet.
- #
- # == Feedback
- #
- # There are doubtless things I could have done better. Comments, suggestions,
- # etc are welcome; e-mail <meta@pobox.com>.
- #
+ = Syndication 0.6
+
+ This module provides classes for parsing web syndication feeds in RSS and
+ Atom formats.
+
+ To parse RSS, use Syndication::RSS::Parser.
+
+ To parse Atom, use Syndication::Atom::Parser.
+
+ If you want my advice on which to generate, my order of preference would
+ be:
+
+ 1. Atom 1.0
+ 2. RSS 1.0
+ 3. RSS 2.0
+
+ My reasoning is simply that I hate having to sniff for HTML (see
+ Syndication::RSS).
+
+ == License
+
+ Syndication is Copyright 2005-2006 mathew <meta@pobox.com>, and is licensed
+ under the same terms as Ruby.
+
+ == Requirements
+
+ Built and tested using Ruby 1.8.4. Needs only the standard library.
+
+ == Rationale
+
+ Ruby already has an RSS library as part of the standard library, so you
+ might be wondering why I decided to write another one.
+
+ I started out trying to document the standard rss module, but found the
+ code rather impenetrable. It was also difficult to see how it could be made
+ documentable via Rdoc.
+
+ Then I tried writing code to use the standard RSS library, and discovered
+ that it had a number of (what I consider to be) defects:
+
+ - It doesn't support RSS 2.0 with extensions (such as iTunes podcast feeds),
+ and it wasn't clear to me how to extend it to do so.
+
+ - It doesn't support RSS 0.9.
+
+ - It doesn't support Atom.
+
+ - The API is different depending on what kind of RSS feed you are parsing.
+
+ I asked around, and discovered that I wasn't the only person dissatisfied
+ with the RSS library. Since fixing the problems would have resulted in
+ breaking existing code that used the RSS module, I opted for an all-new
+ implementation.
+
+ This is the result. The first release was version 0.4, which was actually my
+ fourth attempt at putting together a clean, simple, universal API for RSS
+ and Atom parsing. (The first three never saw public release.)
+
+ == Features
+
+ Here are what I see as the key improvements over the rss module in the
+ Ruby standard library:
+
+ - Supports all RSS versions, including RSS 0.9, as well as Atom.
+
+ - Provides a unified API/object model for accessing the decoded data,
+ with no need to know what format the feed is in.
+
+ - Allows use of extended RSS 2.0 feeds.
+
+ - Simple API, fully documented.
+
+ - Test suite with over 220 test assertions.
+
+ - Commented source code.
+
+ - Less source code than the standard library rss module.
+
+ - Faster than the standard library (at least, in my tests).
+
+ Other features:
+
+ - Optional support for RSS 1.0 Dublin Core, Syndication and Content modules,
+ Apple iTunes Podcast elements, and Google Calendar.
+
+ - Content module decodes CDATA-escaped or encoded HTML content for you.
+
+ - Supports namespaces, and encoded XHTML/HTML in Atom feeds.
+
+ - Dates decoded to Ruby DateTime objects. Note, however, that this is slow,
+ so parsing is only performed if you ask for the value.
+
+ - Simple to extend to support your own RSS extensions, uses reflection.
+
+ - Uses REXML fast stream parsing API for speed, or built-in TagSoup parser
+ for invalid feeds.
+
+ - Non-validating, tries to be as forgiving as possible of structural errors.
+
+ - Remaps namespace prefixes to standard values if it recognizes the module's
+ URL.
+
+ In the interests of balance, here are some key disadvantages over the
+ standard library RSS support:
+
+ - No support for _generating_ RSS feeds, only for parsing them. If
+ you're using Rails, you can use RXML; if not, you can use rss/maker.
+ My feeling is that XML generation isn't a wheel that needs reinventing.
+
+ - Different API, not a drop-in replacement.
+
+ - Incomplete support for Atom 0.3 draft. (Anyone still using it?)
+
+ - No support for base64 data in Atom feeds (yet).
+
+ - No Japanese documentation.
+
+ - No XSL output options.
+
+ - Slower if there are dates in the feed and you ask for their values.
+
+ == Other options
+
+ There are, of course, other Ruby RSS/Atom libraries out there. The ones I
+ know about:
+
+ = simple-rss
+
+ http://rubyforge.org/projects/simple-rss
+
+ Pros:
+ - Much smaller than syndication or rss.
+
+ - Completely non-validating.
+
+ - Backwards compatible with rss in standard library.
+
+ Cons:
+ - Doesn't use a real XML parser.
+
+ - No support for namespaces.
+
+ - Incomplete Atom support (e.g. can't get name and e-mail of <atom:person>
+ elements as separate fields, you still have to decode XHTML data yourself)
+
+ - No documentation.
+
+ For the record, I started work on my library long before simple-rss was
+ announced.
+
+ = feedtools
+
+ http://rubyforge.org/projects/feedtools/
+
+ This one solves most of the same problems as Syndication; however the two
+ were developed in parallel, in ignorance of each other.
+
+ Feedtools builds in database caching and persistance, and HTTP fetching.
+ Personally, I don't think those belong in a feed parsing library--they
+ are easily implemented using other standard libraries if you want them.
+
+ Pros:
+ - Lots of test cases.
+
+ - Used by lots of Rails people.
+
+ - Knows about many more namespaces.
+
+ - Can generate feeds.
+
+ Cons:
+ - Skimpy documentation.
+
+ - Uses HTree then XPath parsing, rather than a single stream parse.
+
+ - Tries to unify RSS and Atom APIs, at the expense of Atom functionality.
+ (Which could also be a pro, depending on your viewpoint.)
+
+ == Design philosophy
+
+ Here's my design philosophy for this module:
+
+ - The interface should be via standard Ruby objects and methods; e.g.
+ feed.channel.item[0].title, rather than (say) a dictionary hash.
+
+ - It should be easier to parse RSS via the module than to hack something
+ together using REXML, even if all you want is a list of titles and URLs.
+
+ - It should be easy to add support for new RSS extensions without needing
+ to know anything about reflection or other advanced topics. Just define
+ a mixin with a bunch of appropriately-named methods, and you're done.
+
+ - The code should be simple to understand.
+
+ - Even so, good complete documentation is extremely important.
+
+ - Be lenient in what you accept.
+
+ - Be conservative in what you generate.
+
+ - Get well-formed feeds parsing reliably, then worry about broken feeds.
+
+ - Atom will hopefully be the future. Provide full support for RSS, but don't
+ hold Atom back by trying to force it into an RSS data model.
+
+ == Future plans
+
+ Here are some possible improvements:
+
+ - RSS and Atom generation. Create objects, then call Syndication::FeedMaker
+ to generate XML in various flavors. This probably won't happen until an XML
+ generator is picked for the Ruby standard library.
+
+ - Faster date parsing. It turns out that when I asked for parsed dates in
+ my test code, the profiler showed Date.parse chewing up 25% of the total
+ CPU time used. A more specific ISO8601 specific date parser could cut
+ that down drastically.
+
+ - Additional Google Data support. I just wanted to be able to display my
+ upcoming calendar dates, but clearly there is a lot more that could be
+ implemented. Unfortunately, recurring events don't seem to have a clean
+ XML representation in Google's data feeds yet.
+
+ == Feedback
+
+ There are doubtless things I could have done better. Comments, suggestions,
+ etc are welcome; e-mail <meta@pobox.com>.
+
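The README notes that dates are decoded to DateTime objects only when a value is requested, because parsing is slow. A minimal sketch of that deferred-parsing idea follows. The `LazyEntry` class and its method names are hypothetical and are not taken from the gem's source; the sketch only illustrates memoizing the parsed value on first access.

```ruby
require 'date'

# Hypothetical illustration only; not a class from the syndication gem.
class LazyEntry
  def initialize(raw_updated)
    @raw_updated = raw_updated  # keep the raw string from the feed
    @updated = nil              # nothing parsed yet
  end

  # Parse on first access, then reuse the cached DateTime.
  def updated
    @updated ||= DateTime.parse(@raw_updated)
  end

  # True once the date has actually been parsed.
  def parsed?
    !@updated.nil?
  end
end

entry = LazyEntry.new('2003-12-13T18:30:02Z')
puts entry.parsed?       # no parsing has happened yet
puts entry.updated.year  # triggers the parse
puts entry.parsed?
```

With this shape, a caller that only reads titles never pays the cost of date parsing.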
data/examples/pastie.rb ADDED
@@ -0,0 +1,17 @@
+
+ require 'open-uri'
+ require '~/WIP/syndication/trunk/syndication/lib/syndication/atom'
+ require 'pp'
+
+ parser = Syndication::Atom::Parser.new
+ feed = nil
+ open("http://blog.pastie.org/atom.xml") {|file|
+   text = file.read
+   feed = parser.parse(text)
+ }
+ puts "#{feed.title.txt}"
+ for i in feed.entries
+   puts "#{i.title.txt}: #{i.summary.txt}"
+   content = i.content
+   puts content.xml
+ end
data/examples/podcast.rb CHANGED
File without changes
data/examples/svgexample.rb ADDED
@@ -0,0 +1,19 @@
+
+ # Example of parsing some SVG out of a feed
+
+ require 'open-uri'
+ require 'syndication/atom'
+ require 'pp'
+
+ parser = Syndication::Atom::Parser.new
+ feed = nil
+ open("svgexample.xml") {|file|
+   text = file.read
+   feed = parser.parse(text)
+ }
+ puts "#{feed.title.txt}"
+ for i in feed.entries
+   puts "#{i.title.txt}: #{i.summary.txt}"
+   content = i.content
+   puts content.xml
+ end
data/examples/svgexample.xml ADDED
@@ -0,0 +1,28 @@
+ <?xml version="1.0" encoding="utf-8"?>
+ <feed xmlns="http://www.w3.org/2005/Atom">
+
+   <title>Example Feed</title>
+   <link href="http://example.org/"/>
+   <updated>2003-12-13T18:30:02Z</updated>
+   <author>
+     <name>John Doe</name>
+   </author>
+   <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
+
+   <entry>
+     <title>Atom-Powered Robots Run Amok</title>
+     <link href="http://example.org/2003/12/13/atom03"/>
+     <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
+     <updated>2003-12-13T18:30:02Z</updated>
+     <summary>Some text.</summary>
+     <content type="image/svg+xml">
+       <svg xmlns="http://www.w3.org/2000/svg"
+            width="100px" height="100px">
+         <title>Itsy bitsy SVG</title>
+         <circle cx="40" cy="25" r="20" style="fill: black;"/>
+         <text x="10" y="80" fill="blue">Hello World</text>
+       </svg>
+     </content>
+   </entry>
+
+ </feed>
@@ -45,7 +45,7 @@ module Syndication
  def tag_start(tag, attrs = nil)
    method = tag2method(tag)
    if self.respond_to?(method)
-     if attrs
+     if attrs and !attrs.empty?
        self.send(method, attrs)
      end
      @current_method = method
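The hunk above tightens the guard in `tag_start`. In Ruby an empty Hash or Array is truthy, so `if attrs` alone still dispatches when a tag has no attributes. The sketch below shows the difference on a hypothetical `Handler` class; the class, its `link` method, and the simplified lookup are assumptions for illustration, not the gem's real code.

```ruby
# Hypothetical stand-in for a feed container; not the gem's actual class.
class Handler
  attr_reader :calls

  def initialize
    @calls = []
  end

  # Attribute handler for a <link> start tag.
  def link(attrs)
    @calls << attrs
  end

  # Mirrors the patched guard: dispatch only for non-nil, non-empty attrs.
  def tag_start(tag, attrs = nil)
    method = tag.downcase
    return unless respond_to?(method)
    send(method, attrs) if attrs && !attrs.empty?
  end
end

h = Handler.new
h.tag_start('link', nil)                                  # skipped: nil
h.tag_start('link', {})                                   # skipped: empty
h.tag_start('link', { 'href' => 'http://example.org/' })  # dispatched
puts h.calls.length
```

Under the old `if attrs` check, the second call would also have reached `link` with an empty collection.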
data/rakefile CHANGED
@@ -5,7 +5,7 @@ require 'rake/gempackagetask'
  require 'rake/testtask'
  require 'rubygems'

- PKG_VERSION = "0.6.2"
+ PKG_VERSION = "0.6.3"

  desc "Create HTML documentation from RDOC"
  Rake::RDocTask.new do |rd|
data/test/google.rb CHANGED
@@ -1,4 +1,5 @@
- # Copyright © mathew <meta@pobox.com> 2005.
+ # Encoding: UTF-8
+ # Copyright © mathew <meta@pobox.com> 2005,2010.
  # Licensed under the same terms as Ruby.

  require 'syndication/atom'
metadata CHANGED
@@ -1,7 +1,12 @@
  --- !ruby/object:Gem::Specification
  name: syndication
  version: !ruby/object:Gem::Version
-   version: 0.6.2
+   prerelease: false
+   segments:
+   - 0
+   - 6
+   - 3
+   version: 0.6.3
  platform: ruby
  authors:
  - mathew
@@ -9,7 +14,7 @@ autorequire:
  bindir: bin
  cert_chain: []

- date: 2009-12-19 00:00:00 -06:00
+ date: 2010-11-17 00:00:00 -06:00
  default_executable:
  dependencies: []

@@ -25,24 +30,27 @@ extra_rdoc_files:
  - CHANGES
  - DEVELOPER
  files:
- - lib/syndication/podcast.rb
- - lib/syndication/common.rb
- - lib/syndication/rss.rb
  - lib/syndication/google.rb
- - lib/syndication/feedburner.rb
+ - lib/syndication/dublincore.rb
  - lib/syndication/content.rb
+ - lib/syndication/rss.rb
  - lib/syndication/atom.rb
- - lib/syndication/dublincore.rb
- - lib/syndication/tagsoup.rb
  - lib/syndication/syndication.rb
+ - lib/syndication/common.rb
+ - lib/syndication/podcast.rb
+ - lib/syndication/tagsoup.rb
+ - lib/syndication/feedburner.rb
+ - test/google.rb
+ - test/feedburntest.rb
  - test/tagsouptest.rb
  - test/atomtest.rb
- - test/feedburntest.rb
- - test/google.rb
  - test/rsstest.rb
- - examples/podcast.rb
  - examples/google.rb
+ - examples/pastie.rb
+ - examples/svgexample.rb
  - examples/yahoo.rb
+ - examples/svgexample.xml
+ - examples/podcast.rb
  - examples/apple.rb
  - rakefile
  - README
@@ -59,21 +67,25 @@ rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
+   none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
+       segments:
+       - 0
        version: "0"
-   version:
  required_rubygems_version: !ruby/object:Gem::Requirement
+   none: false
    requirements:
    - - ">="
      - !ruby/object:Gem::Version
+       segments:
+       - 0
        version: "0"
-   version:
  requirements: []

  rubyforge_project: syndication
- rubygems_version: 1.3.5
+ rubygems_version: 1.3.7
  signing_key:
  specification_version: 3
  summary: A web syndication parser for Atom and RSS with a uniform API