simple-rss 2.0.0 → 2.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 1d7399eb7c1eee38cca3bd3de50a18cdedbf85b1aeb8536688b628abca814cab
- data.tar.gz: 7a27767cd9be24307344430c311507859ac825d7845ea9e703985355dd3ccfa3
+ metadata.gz: 0b914acfc63bfc4e787b6a0373e6c4fbbcec1b953a6517a2709a70b96e5993a6
+ data.tar.gz: 26e9dddcca6e05b34e8ef6da52a3654fadb6932ac9558f9ee8e70117e99f1e9f
  SHA512:
- metadata.gz: 6e01cdc590e4fdac9d0be8df7307491b8f67e8b68422d7970b589b72e52872f8dab23935a6225b05993f4adf614f4641d24e0a23538daf7ed512969d1104a84f
- data.tar.gz: 3568cebc46ca358c7d3a49b0615d49592ed7698e5fcb0c60a3a392c1904de4d4fb7d03fec3ac1d5270ada6c8679737c3a91b880e8dfb91d7aaa69bbfc1f9dab4
+ metadata.gz: 0bbb1967e261cec7c2fdb1bb00fd511562a4647043c40f5b7b6e56aedc5d8d8f003727988ffd2fb77fb7291b397d41008e40eb3a14d156889168423d90f934c5
+ data.tar.gz: c1428ee431c4bfd718a573d2d89e5121e9cef46cbd1ce5ade80b21b9bc98b939f35dcdd133287fb70de83c3a00b7a5099d97a2f12699c6224a194c874a42e215
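The hunk above only records the new digests. As a rough sketch of how such a value could be checked locally, the Ruby snippet below compares a file's SHA-256 digest with an expected hex string using the standard `digest` library. The helper name `sha256_matches?` is an illustrative assumption and is not part of simple-rss or RubyGems.

```ruby
require "digest"

# Hypothetical helper: returns true when the SHA-256 hex digest of the file
# at `path` equals `expected_hex` (surrounding whitespace and case ignored).
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.strip.downcase
end
```

For instance, after unpacking the 2.2.0 `.gem` archive (a plain tar file), `sha256_matches?("data.tar.gz", "26e9dddcca6e05b34e8ef6da52a3654fadb6932ac9558f9ee8e70117e99f1e9f")` would be expected to return `true` if the download is intact.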
data/README.md ADDED
@@ -0,0 +1,321 @@
+ # SimpleRSS
+
+ [![Gem Version](https://badge.fury.io/rb/simple-rss.svg)](https://badge.fury.io/rb/simple-rss)
+ [![CI](https://github.com/cardmagic/simple-rss/actions/workflows/ruby.yml/badge.svg)](https://github.com/cardmagic/simple-rss/actions/workflows/ruby.yml)
+ [![License: LGPL](https://img.shields.io/badge/License-LGPL-blue.svg)](https://opensource.org/licenses/LGPL-3.0)
+
+ A simple, flexible, extensible, and liberal RSS and Atom reader for Ruby. Designed to be backwards compatible with Ruby's standard RSS parser while handling malformed feeds gracefully.
+
+ ## Features
+
+ - Parses both RSS and Atom feeds
+ - Tolerant of malformed XML (regex-based parsing)
+ - Built-in URL fetching with conditional GET support (ETags, Last-Modified)
+ - JSON and XML serialization
+ - Extensible tag definitions
+ - Zero runtime dependencies
+
+ ## What's New in 2.0
+
+ Version 2.0 is a major update with powerful new capabilities:
+
+ - **URL Fetching** - One-liner feed fetching with `SimpleRSS.fetch(url)`. Supports timeouts, custom headers, and automatic redirect following.
+
+ - **Conditional GET** - Bandwidth-efficient polling with ETag and Last-Modified support. Returns `nil` when feeds haven't changed (304 Not Modified).
+
+ - **JSON Serialization** - Export feeds with `to_json`, `to_hash`, and Rails-compatible `as_json`. Time objects serialize to ISO 8601.
+
+ - **XML Serialization** - Convert any parsed feed to clean RSS 2.0 or Atom XML with `to_xml(format: :rss2)` or `to_xml(format: :atom)`.
+
+ - **Array Tags** - Collect all occurrences of a tag (like multiple categories) with the `array_tags:` option.
+
+ - **Attribute Parsing** - Extract attributes from feed, item, and media tags using the `tag#attr` syntax.
+
+ - **UTF-8 Normalization** - All parsed content is automatically normalized to UTF-8 encoding.
+
+ - **Modern Ruby** - Full compatibility with Ruby 3.1 through 4.0, with RBS type annotations and Steep type checking.
+
+ - **Enumerable Support** - Iterate feeds naturally with `each`, `map`, `select`, and all Enumerable methods. Access items by index with `rss[0]` and get the latest items sorted by date with `latest(n)`.
+
+ ## Installation
+
+ Add to your Gemfile:
+
+ ```ruby
+ gem "simple-rss"
+ ```
+
+ Or install directly:
+
+ ```bash
+ gem install simple-rss
+ ```
+
+ ## Quick Start
+
+ ```ruby
+ require "simple-rss"
+ require "uri"
+ require "net/http"
+
+ # Parse from a string or IO object
+ xml = Net::HTTP.get(URI("https://example.com/feed.xml"))
+ rss = SimpleRSS.parse(xml)
+
+ rss.channel.title # => "Example Feed"
+ rss.items.first.title # => "First Post"
+ rss.items.first.pubDate # => 2024-01-15 12:00:00 -0500 (Time object)
+ ```
+
+ ## Usage
+
+ ### Fetching Feeds
+
+ SimpleRSS includes a built-in fetcher with conditional GET support for efficient polling:
+
+ ```ruby
+ # Simple fetch
+ feed = SimpleRSS.fetch("https://example.com/feed.xml")
+
+ # With timeout
+ feed = SimpleRSS.fetch("https://example.com/feed.xml", timeout: 10)
+
+ # Conditional GET - only download if modified
+ feed = SimpleRSS.fetch("https://example.com/feed.xml")
+ # Store these for next request
+ etag = feed.etag
+ last_modified = feed.last_modified
+
+ # On subsequent requests, pass the stored values
+ feed = SimpleRSS.fetch(
+   "https://example.com/feed.xml",
+   etag:,
+   last_modified:
+ )
+ # Returns nil if feed hasn't changed (304 Not Modified)
+ ```
+
+ ### Accessing Feed Data
+
+ SimpleRSS provides both RSS and Atom style accessors:
+
+ ```ruby
+ feed = SimpleRSS.parse(xml)
+
+ # RSS style
+ feed.channel.title
+ feed.channel.link
+ feed.channel.description
+ feed.items
+
+ # Atom style (aliases)
+ feed.feed.title
+ feed.entries
+ ```
+
+ ### Item Attributes
+
+ Items support both hash and method access:
+
+ ```ruby
+ item = feed.items.first
+
+ # Hash access
+ item[:title]
+ item[:link]
+ item[:pubDate]
+
+ # Method access
+ item.title
+ item.link
+ item.pubDate
+ ```
+
+ Date fields are automatically parsed into `Time` objects:
+
+ ```ruby
+ item.pubDate.class # => Time
+ item.pubDate.year # => 2024
+ ```
+
+ ### Iterating with Enumerable
+
+ SimpleRSS includes `Enumerable`, so you can iterate feeds naturally:
+
+ ```ruby
+ feed = SimpleRSS.parse(xml)
+
+ # Iterate over items
+ feed.each { |item| puts item.title }
+
+ # Use any Enumerable method
+ titles = feed.map { |item| item.title }
+ tech_posts = feed.select { |item| item.category == "tech" }
+ first_five = feed.first(5)
+ total = feed.count
+
+ # Access items by index
+ feed[0].title # first item
+ feed[-1].title # last item
+
+ # Get the n most recent items (sorted by pubDate or updated)
+ feed.latest(10)
+ ```
+
+ ### JSON Serialization
+
+ ```ruby
+ feed = SimpleRSS.parse(xml)
+
+ # Get as hash
+ feed.to_hash
+ # => { title: "Feed Title", link: "...", items: [...] }
+
+ # Get as JSON string
+ feed.to_json
+ # => '{"title":"Feed Title","link":"...","items":[...]}'
+
+ # Works with Rails/ActiveSupport
+ feed.as_json
+ ```
+
+ ### XML Serialization
+
+ Convert parsed feeds to standard RSS 2.0 or Atom format:
+
+ ```ruby
+ feed = SimpleRSS.parse(xml)
+
+ # Convert to RSS 2.0
+ feed.to_xml(format: :rss2)
+
+ # Convert to Atom
+ feed.to_xml(format: :atom)
+ ```
+
+ ### Extending Tag Support
+
+ Add support for custom or non-standard tags:
+
+ ```ruby
+ # Add a new feed-level tag
+ SimpleRSS.feed_tags << :custom_tag
+
+ # Add item-level tags
+ SimpleRSS.item_tags << :custom_item_tag
+
+ # Parse tags with specific rel attributes (common in Atom)
+ SimpleRSS.item_tags << :"link+enclosure"
+ # Accessible as: item.link_enclosure
+
+ # Parse tag attributes
+ SimpleRSS.item_tags << :"media:content#url"
+ # Accessible as: item.media_content_url
+
+ # Parse item/entry attributes
+ SimpleRSS.item_tags << :"entry#xml:lang"
+ # Accessible as: item.entry_xml_lang
+ ```
+
+ #### Tag Syntax Reference
+
+ | Syntax | Example | Accessor | Description |
+ |--------|---------|----------|-------------|
+ | `tag` | `:title` | `.title` | Simple element content |
+ | `tag#attr` | `:"media:content#url"` | `.media_content_url` | Attribute value |
+ | `tag+rel` | `:"link+alternate"` | `.link_alternate` | Element with specific `rel` attribute |
+
+ ### Collecting Multiple Values
+
+ By default, SimpleRSS returns only the first occurrence of each tag. To collect all values:
+
+ ```ruby
+ # Collect all categories for each item
+ feed = SimpleRSS.parse(xml, array_tags: [:category])
+
+ item.category # => ["tech", "programming", "ruby"]
+ ```
+
+ ## API Reference
+
+ ### `SimpleRSS.parse(source, options = {})`
+
+ Parse RSS/Atom content from a string or IO object.
+
+ **Parameters:**
+ - `source` - String or IO object containing feed XML
+ - `options` - Hash of options
+   - `:array_tags` - Array of tag symbols to collect as arrays
+
+ **Returns:** `SimpleRSS` instance
+
+ ### `SimpleRSS.fetch(url, options = {})`
+
+ Fetch and parse a feed from a URL.
+
+ **Parameters:**
+ - `url` - Feed URL string
+ - `options` - Hash of options
+   - `:timeout` - Request timeout in seconds
+   - `:etag` - ETag from previous request (for conditional GET)
+   - `:last_modified` - Last-Modified header from previous request
+   - `:follow_redirects` - Follow redirects (default: true)
+   - `:headers` - Hash of additional HTTP headers
+
+ **Returns:** `SimpleRSS` instance, or `nil` if 304 Not Modified
+
+ ### Instance Methods
+
+ | Method | Description |
+ |--------|-------------|
+ | `#channel` / `#feed` | Returns self (for RSS/Atom style access) |
+ | `#items` / `#entries` | Array of parsed items |
+ | `#each` | Iterate over items (includes `Enumerable`) |
+ | `#[](index)` | Access item by index |
+ | `#latest(n = 10)` | Get n most recent items by date |
+ | `#to_json` | JSON string representation |
+ | `#to_hash` / `#as_json` | Hash representation |
+ | `#to_xml(format:)` | XML string (`:rss2` or `:atom`) |
+ | `#etag` | ETag header from fetch (if applicable) |
+ | `#last_modified` | Last-Modified header from fetch (if applicable) |
+ | `#source` | Original source XML string |
+
+ ## Compatibility
+
+ - Ruby 3.1+
+ - No runtime dependencies
+
+ ## Development
+
+ ```bash
+ # Run tests
+ bundle exec rake test
+
+ # Run linter
+ bundle exec rubocop
+
+ # Type checking
+ bundle exec steep check
+
+ # Interactive console
+ bundle exec rake console
+ ```
+
+ ## Contributing
+
+ 1. Fork the repository
+ 2. Create a feature branch (`git checkout -b feature/my-feature`)
+ 3. Make your changes with tests
+ 4. Ensure tests pass (`bundle exec rake test`)
+ 5. Submit a pull request
+
+ ## Authors
+
+ - [Lucas Carlson](mailto:lucas@rufy.com)
+ - [Herval Freire](mailto:hervalfreire@gmail.com)
+
+ Inspired by [Blagg](http://www.raelity.org/lang/perl/blagg) by Rael Dornfest.
+
+ ## License
+
+ This library is released under the terms of the [GNU LGPL](LICENSE).
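
The added README describes conditional GET for `SimpleRSS.fetch` (ETag and Last-Modified in, `nil` out on 304 Not Modified) without showing the HTTP exchange. The sketch below illustrates that general mechanism with Ruby's standard `net/http` only. It is not simple-rss's implementation: the helper names `build_conditional_get` and `conditional_fetch` are hypothetical, and the sketch does not follow redirects or apply timeouts.

```ruby
require "net/http"
require "uri"

# Hypothetical helper: build a GET request that carries the validators
# saved from an earlier response, if any were saved.
def build_conditional_get(uri, etag: nil, last_modified: nil)
  request = Net::HTTP::Get.new(uri)
  request["If-None-Match"] = etag if etag
  request["If-Modified-Since"] = last_modified if last_modified
  request
end

# Hypothetical helper: returns nil when the server answers 304 Not Modified,
# otherwise a hash with the response body and the validators to store.
def conditional_fetch(url, etag: nil, last_modified: nil)
  uri = URI(url)
  request = build_conditional_get(uri, etag: etag, last_modified: last_modified)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(request)
  end
  return nil if response.is_a?(Net::HTTPNotModified)

  { body: response.body, etag: response["ETag"], last_modified: response["Last-Modified"] }
end
```

A caller would store the returned `:etag` and `:last_modified` values and pass them back on the next poll, treating a `nil` result as "nothing new", which mirrors the contract the README states for `SimpleRSS.fetch`.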