feedbag 0.10.3 → 1.0.1

Files changed (5)
  1. checksums.yaml +4 -4
  2. data/COPYING +1 -1
  3. data/README.markdown +39 -19
  4. data/lib/feedbag.rb +70 -36
  5. metadata +54 -12
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: c5439ea287031a4c8aa34a87ade233f73c89acaffb8c03e05f4a1bc9fc1f9eb7
-   data.tar.gz: 043fd828e69509b5a15260c8dc0e0991a70dbc2c01c7128176f74680b640d6f6
+   metadata.gz: aed91daea08560f45514eb565a8f5524a02bb39cff16acec30d4fcbf72113944
+   data.tar.gz: a873374c6b527f7d55c582fe7126e6c6d26f873828a4f5bf804316d7c113218c
  SHA512:
-   metadata.gz: 9b3d6f64e975c623f8bc8046670fef214a0209cd09fb33e81b6a27e9684175f48fa950a1803609bb7334b31160df5cd107eaf3af525fcb423bba2b06f8986102
-   data.tar.gz: a5c04f3e465f4e36c570e6a6f42f4c75246d4ef94fbb1e815530d1b9c9efd5e5cdbb7b248eca2ab97044ae18a7630b820aacdfcf8538721d59299ecfc40e82f7
+   metadata.gz: a512e9f8d6e3994d14812681631faa38eb38d164d09b8fe5c8adb2f754a4728460e93079d01048c89e65c4f929276268fde16bc88b18cfa4d9736b411d94b219
+   data.tar.gz: c8be3cae759ded40c08737dea5b3547c0f7a9e9fd5a2403bd5cda280bae8af69227edbb34ecfc1a72b17931acb35935605f9cd682808bdd575383ff05bf0144f
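The digests listed above can be compared against a downloaded archive. The following is a minimal sketch, assuming the `.gem` file has already been unpacked so that `data.tar.gz` exists on disk. The helper name `sha256_matches?` is hypothetical and is not part of RubyGems or feedbag.

```ruby
require "digest"

# Hypothetical helper: compare a file's SHA256 hex digest with an expected value,
# such as the data.tar.gz entry listed under SHA256 in checksums.yaml.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end
```

A call such as `sha256_matches?("data.tar.gz", "a873374c6b527f7d55c582fe7126e6c6d26f873828a4f5bf804316d7c113218c")` would then report whether the local archive matches the 1.0.1 entry.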
data/COPYING CHANGED
@@ -1,4 +1,4 @@
- Copyright (C) 2008-2021 David Moreno <damog@damog.net>
+ Copyright (C) 2008-2022 David Moreno <damog@damog.net> et al.

  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
data/README.markdown CHANGED
@@ -1,7 +1,7 @@
  Feedbag
  =======

- Feedbag is Ruby's favorite auto-discovery tool/library!
+ Feedbag is Ruby's favorite feed auto-discovery tool/library!

  ### Quick synopsis

@@ -10,9 +10,9 @@ Feedbag is Ruby's favorite auto-discovery tool/library!
  => true
  >> Feedbag.find "damog.net/blog"
  => ["http://damog.net/blog/atom.xml"]
- >> Feedbag.feed? "perl.org"
+ >> Feedbag.feed? "google.com"
  => false
- >> Feedbag.feed?("https://m.signalvnoise.com/feed")
+ >> Feedbag.feed?("https://daringfireball.net/feeds/main")
  => true
  ```

@@ -28,44 +28,64 @@ You can also use the command line tool for quick queries, if you install the gem

  » feedbag https://www.ruby-lang.org/en/
  == https://www.ruby-lang.org/en/:
- - https://www.ruby-lang.org/en/feeds/news.rss
+ - https://www.ruby-lang.org/en/feeds/news.rss


  ### Usage
- Feedbag will find all RSS feed types. Here's an example of finding ATOM and JSON Feed
+
+ Feedbag will find RSS, Atom, and JSON feed types:

  ```ruby
- > Feedbag.find('https://daringfireball.net')
- => ["https://daringfireball.net/feeds/main", "https://daringfireball.net/feeds/json", "https://daringfireball.net/linked/2021/02/17/bookfeed"]
+ >> Feedbag.find('https://daringfireball.net')
+ => ["https://daringfireball.net/feeds/main", "https://daringfireball.net/feeds/json"]
  ```

- Feedbag defaults to a User-Agent string of **Feedbag/1.10.2**, however you can override this
+ #### Custom User-Agent
+
+ Feedbag defaults to a User-Agent string of `Feedbag/VERSION`, but you can override it:

  ```ruby
- 0> Feedbag.find('https://kottke.org', 'User-Agent' => "My Personal Agent/1.0.1")
- => ["http://feeds.kottke.org/main", "http://feeds.kottke.org/json"]
- ````
+ >> Feedbag.find('https://kottke.org', 'User-Agent' => "My Personal Agent/1.0.1")
+ => ["http://feeds.kottke.org/main"]
+ ```

- The other options passed to find, will be passed to OpenURI. For example:
+ Other options passed to `find` will be forwarded to OpenURI:

  ```ruby
- Feedbag.find("https://kottke.org", 'User-Agent' => "My Personal Agent/1.0.1", open_timeout: 1000)
+ Feedbag.find("https://example.com", 'User-Agent' => "My Agent/1.0", open_timeout: 10)
  ```

- You can find the other options to OpenURI [here](https://rubyapi.org/o/openuri/openread#method-i-open).
+ See [OpenURI options](https://rubyapi.org/o/openuri/openread#method-i-open) for more details.
+
+ #### Custom Logger

+ By default, errors are written to `$stderr`. You can redirect them to a custom logger:
+
+ ```ruby
+ # Use Rails logger
+ Feedbag.logger = Rails.logger
+
+ # Or silence all output
+ Feedbag.logger = Logger.new('/dev/null')
+ ```
+
+ #### Non-ASCII URL Support
+
+ Feedbag handles internationalized URLs (IRIs) with non-ASCII characters:
+
+ ```ruby
+ >> Feedbag.find("https://example.com/中文/feed/")
+ # Works! URLs are automatically normalized
+ ```

  ### Why should you use it?

- - Because it only uses [Nokogiri](http://nokogiri.org/) as dependency.
+ - Because it only uses [Nokogiri](http://nokogiri.org/) and [Addressable](https://github.com/sporkmonger/addressable) as dependencies.
  - Because it follows modern feed filename conventions (like those ones used by WordPress blogs, or Blogger, etc).
  - Because it's a single file you can embed easily in your application.
+ - Because it handles international URLs with non-ASCII characters.
  - Because it's faster than anything else.

- ### Web Service
-
- Now you can also POST directly into an AWS Lambda function webservice that runs `Feedbag.find()`. Don't overuse it. It's [here](https://github.com/damog/aws-lambda-feedbag).
-
  ### Author

  [David Moreno](http://damog.net/) <[damog@damog.net](mailto:damog@damog.net)>.
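The Custom Logger section added to the README can be tried without Rails by pointing a standard `Logger` at an in-memory buffer. This is a minimal sketch under stated assumptions: `FeedFinder` is a hypothetical stand-in that mirrors only the class-level logger accessor, and `message_only_logger` is a helper invented for the example, not a feedbag method.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in class; it mirrors only a class-level logger accessor.
class FeedFinder
  class << self
    attr_writer :logger

    # Lazily fall back to a message-only logger on $stderr.
    def logger
      @logger ||= message_only_logger($stderr)
    end

    # Build a Logger that prints the bare message, one per line.
    def message_only_logger(io)
      log = Logger.new(io)
      log.formatter = proc { |_severity, _datetime, _progname, msg| "#{msg}\n" }
      log
    end
  end
end

# Redirect error output into an in-memory buffer instead of $stderr.
buffer = StringIO.new
FeedFinder.logger = FeedFinder.message_only_logger(buffer)
FeedFinder.logger.error "Socket error occurred with: `http://example.invalid'"
puts buffer.string
```

Where output should be discarded, `Logger.new(File::NULL)` is a portable alternative to hard-coding `'/dev/null'`.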
data/lib/feedbag.rb CHANGED
@@ -1,33 +1,46 @@
  #!/usr/bin/ruby

- # Copyright (c) 2008-2019 David Moreno <damog@damog.net>
- #
- # Permission is hereby granted, free of charge, to any person obtaining
- # a copy of this software and associated documentation files (the
- # "Software"), to deal in the Software without restriction, including
- # without limitation the rights to use, copy, modify, merge, publish,
- # distribute, sublicense, and/or sell copies of the Software, and to
- # permit persons to whom the Software is furnished to do so, subject to
- # the following conditions:
- #
- # The above copyright notice and this permission notice shall be
- # included in all copies or substantial portions of the Software.
- #
- # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
- # LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
- # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
- # WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+ # See COPYING before using this software.

  require "rubygems"
  require "nokogiri"
  require "open-uri"
  require "net/http"
+ require "logger"
+
+ begin
+   require "addressable/uri"
+ rescue LoadError
+   # addressable will be loaded after bundle install
+ end

  class Feedbag
-   VERSION = '0.10.3'
+   VERSION = '1.0.1'
+
+   # Configurable logger for error output
+   # Default writes to $stderr. Can be set to Rails.logger or any Logger-compatible object.
+   #
+   # @example Silence all output
+   # Feedbag.logger = Logger.new('/dev/null')
+   #
+   # @example Use Rails logger
+   # Feedbag.logger = Rails.logger
+   #
+   class << self
+     attr_writer :logger
+
+     def logger
+       @logger ||= default_logger
+     end
+
+     private
+
+     def default_logger
+       logger = Logger.new($stderr)
+       logger.formatter = proc { |severity, _datetime, _progname, msg| "#{msg}\n" }
+       logger
+     end
+   end
    CONTENT_TYPES = [
      'application/x.atom+xml',
      'application/atom+xml',
@@ -44,7 +57,7 @@ class Feedbag
    end

    def self.find(url, options = {})
-     new(options: options).find(url, **options)
+     new(options: options).find(url, options)
    end

    def initialize(options: nil)
@@ -53,9 +66,23 @@ class Feedbag
      @options["User-Agent"] ||= "Feedbag/#{VERSION}"
    end

+   # Normalize a URL to handle non-ASCII characters (IRIs)
+   # This converts internationalized URLs to valid ASCII URIs
+   def self.normalize_url(url)
+     return url if url.nil? || url.empty?
+     if defined?(Addressable::URI)
+       Addressable::URI.parse(url).normalize.to_s
+     else
+       url
+     end
+   rescue Addressable::URI::InvalidURIError
+     url
+   end
+
    def feed?(url)
-     # use LWR::Simple.normalize some time
-     url_uri = URI.parse(url)
+     # Normalize URL to handle non-ASCII characters
+     normalized_url = Feedbag.normalize_url(url)
+     url_uri = URI.parse(normalized_url)
      url = "#{url_uri.scheme or 'http'}://#{url_uri.host}#{url_uri.path}"
      url << "?#{url_uri.query}" if url_uri.query

@@ -71,7 +98,9 @@ class Feedbag
    end

    def find(url, options = {})
-     url_uri = URI.parse(url)
+     # Normalize URL to handle non-ASCII characters
+     normalized_url = Feedbag.normalize_url(url)
+     url_uri = URI.parse(normalized_url)
      url = nil
      if url_uri.scheme.nil?
        url = "http://#{url_uri.to_s}"
@@ -95,11 +124,11 @@ class Feedbag
        # TODO: actually find out timeout. use Terminator?
        # $stderr.puts "Feed looked like feed but might not have passed validation or timed out"
      rescue => ex
-       $stderr.puts "#{ex.class} error occurred with: `#{url}': #{ex.message}"
+       Feedbag.logger.error "#{ex.class} error occurred with: `#{url}': #{ex.message}"
      end

      begin
-       html = URI.open(url, **@options) do |f|
+       html = URI.open(url, @options) do |f|
          content_type = f.content_type.downcase
          if content_type == "application/octet-stream" # open failed
            content_type = f.meta["content-type"].gsub(/;.*$/, '')
@@ -154,13 +183,13 @@ class Feedbag
          end
        end
      rescue Timeout::Error => err
-       $stderr.puts "Timeout error occurred with `#{url}: #{err}'"
+       Feedbag.logger.error "Timeout error occurred with `#{url}: #{err}'"
      rescue OpenURI::HTTPError => the_error
-       $stderr.puts "Error occurred with `#{url}': #{the_error}"
+       Feedbag.logger.error "Error occurred with `#{url}': #{the_error}"
      rescue SocketError => err
-       $stderr.puts "Socket error occurred with: `#{url}': #{err}"
+       Feedbag.logger.error "Socket error occurred with: `#{url}': #{err}"
      rescue => ex
-       $stderr.puts "#{ex.class} error occurred with: `#{url}': #{ex.message}"
+       Feedbag.logger.error "#{ex.class} error occurred with: `#{url}': #{ex.message}"
      ensure
        return @feeds
      end
@@ -179,19 +208,24 @@ class Feedbag
      # puts "#{feed_url} - #{orig_url}"
      url = feed_url.sub(/^feed:/, '').strip

+     # Normalize URL to handle non-ASCII characters
+     url = Feedbag.normalize_url(url)
+
      if base_uri
        # url = base_uri + feed_url
-       url = URI.parse(base_uri).merge(feed_url).to_s
+       normalized_base = Feedbag.normalize_url(base_uri)
+       url = URI.parse(normalized_base).merge(url).to_s
      end

      begin
        uri = URI.parse(url)
-     rescue
-       puts "Error with `#{url}'"
-       exit 1
+     rescue => ex
+       Feedbag.logger.error "Error parsing URL `#{url}': #{ex.message}"
+       return
      end
      unless uri.absolute?
-       orig = URI.parse(orig_url)
+       normalized_orig = Feedbag.normalize_url(orig_url)
+       orig = URI.parse(normalized_orig)
        url = orig.merge(url).to_s
      end

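The normalization step in this file exists because the standard library parser rejects non-ASCII input. The sketch below illustrates that motivation and the optional-dependency guard, under a few assumptions: `normalize_url` here is a standalone function rather than the gem's class method, `stdlib_parses?` is a hypothetical helper, and the rescue is deliberately broader than the one in the diff. If Addressable is not installed, the sketch returns URLs unchanged.

```ruby
require "uri"

begin
  require "addressable/uri" # optional; needed for real normalization
rescue LoadError
  # Without Addressable the sketch returns URLs unchanged.
end

# Standalone stand-in for the normalize step shown in the diff.
def normalize_url(url)
  return url if url.nil? || url.empty?
  return url unless defined?(Addressable::URI)

  Addressable::URI.parse(url).normalize.to_s
rescue StandardError
  # Fall back to the raw input if parsing fails for any reason.
  url
end

# Hypothetical helper: does the standard library accept this string as a URI?
def stdlib_parses?(url)
  URI.parse(url)
  true
rescue URI::InvalidURIError
  false
end

iri = "https://example.com/中文/feed/"
puts "stdlib accepts raw IRI: #{stdlib_parses?(iri)}"
puts "after normalize_url:    #{normalize_url(iri)}"
```

Returning the input unchanged when Addressable is missing keeps plain ASCII URLs working, at the cost of non-ASCII URLs still failing later in `URI.parse`.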
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: feedbag
  version: !ruby/object:Gem::Version
-   version: 0.10.3
+   version: 1.0.1
  platform: ruby
  authors:
  - David Moreno
- autorequire:
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-11-28 00:00:00.000000000 Z
+ date: 2025-11-29 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: nokogiri
@@ -30,6 +30,20 @@ dependencies:
      - - ">="
        - !ruby/object:Gem::Version
          version: 1.8.2
+ - !ruby/object:Gem::Dependency
+   name: addressable
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.8'
+   type: :runtime
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '2.8'
  - !ruby/object:Gem::Dependency
    name: shoulda
    requirement: !ruby/object:Gem::Requirement
@@ -48,22 +62,22 @@ dependencies:
    name: mocha
    requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
-       - !ruby/object:Gem::Version
-         version: '0.12'
      - - ">="
        - !ruby/object:Gem::Version
          version: 0.12.0
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.12'
    type: :development
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
-     - - "~>"
-       - !ruby/object:Gem::Version
-         version: '0.12'
      - - ">="
        - !ruby/object:Gem::Version
          version: 0.12.0
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.12'
  - !ruby/object:Gem::Dependency
    name: webmock
    requirement: !ruby/object:Gem::Requirement
@@ -92,6 +106,34 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: '11'
+ - !ruby/object:Gem::Dependency
+   name: rake
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '12'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '12'
+ - !ruby/object:Gem::Dependency
+   name: test-unit
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '3'
  description: Ruby's favorite feed auto-discovery tool
  email: damog@damog.net
  executables:
@@ -110,7 +152,7 @@ homepage: http://github.com/damog/feedbag
  licenses:
  - MIT
  metadata: {}
- post_install_message:
+ post_install_message:
  rdoc_options:
  - "--main"
  - README.markdown
@@ -127,8 +169,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
    - !ruby/object:Gem::Version
      version: '0'
  requirements: []
- rubygems_version: 3.1.4
- signing_key:
+ rubygems_version: 3.0.3.1
+ signing_key:
  specification_version: 4
  summary: RSS/Atom feed auto-discovery tool
  test_files: []
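The new runtime dependency is recorded with the pessimistic constraint `~> 2.8`. As a rough illustration of which addressable releases that admits, the sketch below evaluates the constraint with RubyGems' own `Gem::Requirement`. The helper name `addressable_allowed?` and the sample version strings are assumptions made for the example.

```ruby
require "rubygems"

# "~> 2.8" is the constraint recorded for addressable in the metadata above.
ADDRESSABLE_REQUIREMENT = Gem::Requirement.new("~> 2.8")

# Hypothetical helper: would this addressable version satisfy the constraint?
def addressable_allowed?(version)
  ADDRESSABLE_REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end

# Sample versions chosen for illustration only.
%w[2.7.9 2.8.0 2.8.7 2.9.1 3.0.0].each do |version|
  puts "#{version}: #{addressable_allowed?(version)}"
end
```

In other words, `~> 2.8` accepts any 2.x release from 2.8 upward and excludes 3.0 and later.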