source_monitor 0.7.0 → 0.8.0

Files changed (141)
  1. checksums.yaml +4 -4
  2. data/.claude/commands/release.md +45 -22
  3. data/.claude/skills/sm-configure/SKILL.md +10 -1
  4. data/.claude/skills/sm-configure/reference/configuration-reference.md +44 -0
  5. data/.claude/skills/sm-host-setup/reference/initializer-template.md +17 -0
  6. data/.claude/skills/sm-host-setup/reference/setup-checklist.md +2 -0
  7. data/.claude/skills/sm-job/reference/job-conventions.md +26 -0
  8. data/.claude/skills/sm-upgrade/reference/version-history.md +22 -0
  9. data/.gitignore +10 -0
  10. data/AGENTS.md +1 -1
  11. data/CHANGELOG.md +56 -0
  12. data/CLAUDE.md +11 -5
  13. data/Gemfile.lock +1 -1
  14. data/README.md +6 -4
  15. data/VERSION +1 -1
  16. data/app/assets/builds/source_monitor/application.css +43 -0
  17. data/app/assets/builds/source_monitor/application.js +127 -0
  18. data/app/assets/builds/source_monitor/application.js.map +3 -3
  19. data/app/assets/javascripts/source_monitor/application.js +2 -0
  20. data/app/assets/javascripts/source_monitor/controllers/notification_container_controller.js +138 -0
  21. data/app/assets/javascripts/source_monitor/controllers/notification_controller.js +11 -0
  22. data/app/controllers/source_monitor/source_favicon_fetches_controller.rb +38 -0
  23. data/app/controllers/source_monitor/sources_controller.rb +11 -0
  24. data/app/helpers/source_monitor/application_helper.rb +51 -0
  25. data/app/jobs/source_monitor/favicon_fetch_job.rb +71 -0
  26. data/app/jobs/source_monitor/import_opml_job.rb +9 -0
  27. data/app/jobs/source_monitor/source_health_check_job.rb +10 -0
  28. data/app/models/source_monitor/source.rb +2 -0
  29. data/app/views/layouts/source_monitor/application.html.erb +23 -2
  30. data/app/views/source_monitor/shared/_toast.html.erb +1 -0
  31. data/app/views/source_monitor/sources/_details.html.erb +34 -5
  32. data/app/views/source_monitor/sources/_row.html.erb +11 -6
  33. data/config/routes.rb +1 -0
  34. data/docs/configuration.md +1 -1
  35. data/docs/upgrade.md +22 -0
  36. data/lib/generators/source_monitor/install/templates/source_monitor.rb.tt +15 -1
  37. data/lib/source_monitor/configuration/favicons_settings.rb +42 -0
  38. data/lib/source_monitor/configuration/http_settings.rb +1 -1
  39. data/lib/source_monitor/configuration/scraping_settings.rb +1 -1
  40. data/lib/source_monitor/configuration.rb +3 -1
  41. data/lib/source_monitor/favicons/discoverer.rb +196 -0
  42. data/lib/source_monitor/fetching/feed_fetcher/source_updater.rb +21 -0
  43. data/lib/source_monitor/fetching/feed_fetcher.rb +1 -0
  44. data/lib/source_monitor/http.rb +5 -3
  45. data/lib/source_monitor/version.rb +1 -1
  46. data/lib/source_monitor.rb +4 -0
  47. data/lib/tasks/test_fast.rake +11 -0
  48. data/source_monitor.gemspec +1 -1
  49. metadata +7 -93
  50. data/.vbw-planning/PROJECT.md +0 -51
  51. data/.vbw-planning/ROADMAP.md +0 -32
  52. data/.vbw-planning/SHIPPED.md +0 -63
  53. data/.vbw-planning/STATE.md +0 -27
  54. data/.vbw-planning/codebase/ARCHITECTURE.md +0 -147
  55. data/.vbw-planning/codebase/CONCERNS.md +0 -99
  56. data/.vbw-planning/codebase/CONVENTIONS.md +0 -97
  57. data/.vbw-planning/codebase/DEPENDENCIES.md +0 -100
  58. data/.vbw-planning/codebase/INDEX.md +0 -86
  59. data/.vbw-planning/codebase/META.md +0 -42
  60. data/.vbw-planning/codebase/PATTERNS.md +0 -262
  61. data/.vbw-planning/codebase/STACK.md +0 -101
  62. data/.vbw-planning/codebase/STRUCTURE.md +0 -324
  63. data/.vbw-planning/codebase/TESTING.md +0 -154
  64. data/.vbw-planning/config.json +0 -53
  65. data/.vbw-planning/discovery.json +0 -26
  66. data/.vbw-planning/milestones/default/ROADMAP.md +0 -115
  67. data/.vbw-planning/milestones/default/STATE.md +0 -82
  68. data/.vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-01-SUMMARY.md +0 -56
  69. data/.vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-01.md +0 -187
  70. data/.vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-02-SUMMARY.md +0 -64
  71. data/.vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-02.md +0 -137
  72. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-01-SUMMARY.md +0 -67
  73. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-01.md +0 -142
  74. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-02-SUMMARY.md +0 -64
  75. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-02.md +0 -138
  76. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-03-SUMMARY.md +0 -85
  77. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-03.md +0 -147
  78. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-04-SUMMARY.md +0 -63
  79. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-04.md +0 -129
  80. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-05-SUMMARY.md +0 -74
  81. data/.vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-05.md +0 -154
  82. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/03-VERIFICATION-wave1.md +0 -303
  83. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/03-VERIFICATION.md +0 -510
  84. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-01-SUMMARY.md +0 -61
  85. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-01.md +0 -161
  86. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-02-SUMMARY.md +0 -66
  87. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-02.md +0 -132
  88. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-03-SUMMARY.md +0 -59
  89. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-03.md +0 -171
  90. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-04-SUMMARY.md +0 -56
  91. data/.vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-04.md +0 -152
  92. data/.vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/04-CONTEXT.md +0 -33
  93. data/.vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-01-SUMMARY.md +0 -42
  94. data/.vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-01.md +0 -119
  95. data/.vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-02-SUMMARY.md +0 -52
  96. data/.vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-02.md +0 -195
  97. data/.vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-03-SUMMARY.md +0 -79
  98. data/.vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-03.md +0 -130
  99. data/.vbw-planning/milestones/generator-enhancements/REQUIREMENTS.md +0 -72
  100. data/.vbw-planning/milestones/generator-enhancements/ROADMAP.md +0 -125
  101. data/.vbw-planning/milestones/generator-enhancements/SHIPPED.md +0 -40
  102. data/.vbw-planning/milestones/generator-enhancements/STATE.md +0 -43
  103. data/.vbw-planning/milestones/generator-enhancements/phases/01-generator-steps/01-CONTEXT.md +0 -33
  104. data/.vbw-planning/milestones/generator-enhancements/phases/01-generator-steps/01-VERIFICATION.md +0 -86
  105. data/.vbw-planning/milestones/generator-enhancements/phases/01-generator-steps/PLAN-01-SUMMARY.md +0 -61
  106. data/.vbw-planning/milestones/generator-enhancements/phases/01-generator-steps/PLAN-01.md +0 -380
  107. data/.vbw-planning/milestones/generator-enhancements/phases/02-verification/02-VERIFICATION.md +0 -78
  108. data/.vbw-planning/milestones/generator-enhancements/phases/02-verification/PLAN-01-SUMMARY.md +0 -46
  109. data/.vbw-planning/milestones/generator-enhancements/phases/02-verification/PLAN-01.md +0 -500
  110. data/.vbw-planning/milestones/generator-enhancements/phases/03-docs-alignment/03-VERIFICATION.md +0 -89
  111. data/.vbw-planning/milestones/generator-enhancements/phases/03-docs-alignment/PLAN-01-SUMMARY.md +0 -48
  112. data/.vbw-planning/milestones/generator-enhancements/phases/03-docs-alignment/PLAN-01.md +0 -456
  113. data/.vbw-planning/milestones/generator-enhancements/phases/04-dashboard-ux/04-VERIFICATION.md +0 -129
  114. data/.vbw-planning/milestones/generator-enhancements/phases/04-dashboard-ux/PLAN-01-SUMMARY.md +0 -70
  115. data/.vbw-planning/milestones/generator-enhancements/phases/04-dashboard-ux/PLAN-01.md +0 -747
  116. data/.vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/05-VERIFICATION.md +0 -156
  117. data/.vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/PLAN-01-SUMMARY.md +0 -69
  118. data/.vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/PLAN-01.md +0 -455
  119. data/.vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/PLAN-02-SUMMARY.md +0 -39
  120. data/.vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/PLAN-02.md +0 -488
  121. data/.vbw-planning/milestones/generator-enhancements/phases/06-netflix-feed-fix/06-VERIFICATION.md +0 -100
  122. data/.vbw-planning/milestones/generator-enhancements/phases/06-netflix-feed-fix/PLAN-01-SUMMARY.md +0 -37
  123. data/.vbw-planning/milestones/generator-enhancements/phases/06-netflix-feed-fix/PLAN-01.md +0 -345
  124. data/.vbw-planning/milestones/upgrade-assurance/REQUIREMENTS.md +0 -80
  125. data/.vbw-planning/milestones/upgrade-assurance/ROADMAP.md +0 -75
  126. data/.vbw-planning/milestones/upgrade-assurance/STATE.md +0 -29
  127. data/.vbw-planning/milestones/upgrade-assurance/phases/01-upgrade-command/01-VERIFICATION.md +0 -144
  128. data/.vbw-planning/milestones/upgrade-assurance/phases/01-upgrade-command/PLAN-01-SUMMARY.md +0 -43
  129. data/.vbw-planning/milestones/upgrade-assurance/phases/01-upgrade-command/PLAN-01.md +0 -405
  130. data/.vbw-planning/milestones/upgrade-assurance/phases/02-config-deprecation/PLAN-01-SUMMARY.md +0 -27
  131. data/.vbw-planning/milestones/upgrade-assurance/phases/02-config-deprecation/PLAN-01.md +0 -303
  132. data/.vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/03-VERIFICATION.md +0 -380
  133. data/.vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/PLAN-01-SUMMARY.md +0 -36
  134. data/.vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/PLAN-01.md +0 -652
  135. data/.vbw-planning/phases/01-aia-certificate-resolution/.context-dev.md +0 -17
  136. data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-01-SUMMARY.md +0 -26
  137. data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-01.md +0 -71
  138. data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-02-SUMMARY.md +0 -16
  139. data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-02.md +0 -56
  140. data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-03-SUMMARY.md +0 -17
  141. data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-03.md +0 -98
@@ -0,0 +1,42 @@
+ # frozen_string_literal: true
+
+ module SourceMonitor
+   class Configuration
+     class FaviconsSettings
+       attr_accessor :enabled,
+         :fetch_timeout,
+         :max_download_size,
+         :retry_cooldown_days,
+         :allowed_content_types
+
+       DEFAULT_FETCH_TIMEOUT = 5 # seconds
+       DEFAULT_MAX_DOWNLOAD_SIZE = 1 * 1024 * 1024 # 1 MB
+       DEFAULT_RETRY_COOLDOWN_DAYS = 7
+       DEFAULT_ALLOWED_CONTENT_TYPES = %w[
+         image/x-icon
+         image/vnd.microsoft.icon
+         image/png
+         image/jpeg
+         image/gif
+         image/svg+xml
+         image/webp
+       ].freeze
+
+       def initialize
+         reset!
+       end
+
+       def reset!
+         @enabled = true
+         @fetch_timeout = DEFAULT_FETCH_TIMEOUT
+         @max_download_size = DEFAULT_MAX_DOWNLOAD_SIZE
+         @retry_cooldown_days = DEFAULT_RETRY_COOLDOWN_DAYS
+         @allowed_content_types = DEFAULT_ALLOWED_CONTENT_TYPES.dup
+       end
+
+       def enabled?
+         !!enabled && !!defined?(ActiveStorage)
+       end
+     end
+   end
+ end
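The settings in this hunk are plain accessors with defaults, and `enabled?` additionally requires `ActiveStorage` to be defined. The following is a minimal standalone sketch of that guard, plus a cooldown check of the kind `retry_cooldown_days` appears intended to drive. `FaviconsSettingsSketch` and `cooldown_elapsed?` are hypothetical names used only for illustration and are not part of the gem; the sketch uses plain Ruby time arithmetic instead of ActiveSupport.

```ruby
require "time"

# Hypothetical stand-in that copies two defaults from the hunk; not the gem's class.
class FaviconsSettingsSketch
  attr_accessor :enabled, :retry_cooldown_days

  def initialize
    @enabled = true
    @retry_cooldown_days = 7
  end

  # Same guard as in the diff: the flag must be truthy AND ActiveStorage must be loaded.
  def enabled?
    !!enabled && !!defined?(ActiveStorage)
  end
end

# Hypothetical helper: true when there was no previous attempt, or when the
# previous attempt is at least `cooldown_days` old relative to `now`.
def cooldown_elapsed?(last_attempt_iso8601, cooldown_days, now: Time.now)
  return true if last_attempt_iso8601.nil? || last_attempt_iso8601.empty?

  Time.parse(last_attempt_iso8601) <= now - (cooldown_days * 86_400)
end

settings = FaviconsSettingsSketch.new
puts settings.enabled? # false in a plain Ruby process, where ActiveStorage is not loaded
```

In a host application without Active Storage, the guard means favicon fetching stays off even when `enabled` is left at its default of `true`.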
@@ -42,7 +42,7 @@ module SourceMonitor
        private

        def default_user_agent
-         "SourceMonitor/#{SourceMonitor::VERSION}"
+         "Mozilla/5.0 (compatible; SourceMonitor/#{SourceMonitor::VERSION})"
        end
      end
    end
@@ -5,7 +5,7 @@ module SourceMonitor
      class ScrapingSettings
        attr_accessor :max_in_flight_per_source, :max_bulk_batch_size

-       DEFAULT_MAX_IN_FLIGHT = 25
+       DEFAULT_MAX_IN_FLIGHT = nil
        DEFAULT_MAX_BULK_BATCH_SIZE = 100

        def initialize
@@ -9,6 +9,7 @@ require "source_monitor/configuration/realtime_settings"
  require "source_monitor/configuration/retention_settings"
  require "source_monitor/configuration/authentication_settings"
  require "source_monitor/configuration/images_settings"
+ require "source_monitor/configuration/favicons_settings"
  require "source_monitor/configuration/scraper_registry"
  require "source_monitor/configuration/events"
  require "source_monitor/configuration/validation_definition"
@@ -28,7 +29,7 @@ module SourceMonitor
        :mission_control_enabled,
        :mission_control_dashboard_path

-     attr_reader :http, :scrapers, :retention, :events, :models, :realtime, :fetching, :health, :authentication, :scraping, :images
+     attr_reader :http, :scrapers, :retention, :events, :models, :realtime, :fetching, :health, :authentication, :scraping, :images, :favicons

      DEFAULT_QUEUE_NAMESPACE = "source_monitor"

@@ -53,6 +54,7 @@ module SourceMonitor
        @authentication = AuthenticationSettings.new
        @scraping = ScrapingSettings.new
        @images = ImagesSettings.new
+       @favicons = FaviconsSettings.new
      end

      def queue_name_for(role)
@@ -0,0 +1,196 @@
+ # frozen_string_literal: true
+
+ require "faraday"
+ require "faraday/follow_redirects"
+ require "securerandom"
+ require "nokogiri"
+
+ module SourceMonitor
+   module Favicons
+     class Discoverer
+       Result = Struct.new(:io, :filename, :content_type, :url, keyword_init: true)
+
+       attr_reader :website_url, :settings
+
+       def initialize(website_url, settings: nil)
+         @website_url = website_url
+         @settings = settings || SourceMonitor.config.favicons
+       end
+
+       def call
+         return if website_url.blank?
+
+         try_html_link_tags || try_google_favicon_api || try_favicon_ico
+       rescue Faraday::Error, URI::InvalidURIError, Timeout::Error
+         nil
+       end
+
+       private
+
+       def try_favicon_ico
+         uri = URI.parse(website_url)
+         favicon_url = "#{uri.scheme}://#{uri.host}/favicon.ico"
+         download_favicon(favicon_url)
+       rescue URI::InvalidURIError
+         nil
+       end
+
+       def try_html_link_tags
+         response = html_client.get(website_url)
+         return unless response.status == 200
+
+         doc = Nokogiri::HTML(response.body)
+         candidates = extract_icon_candidates(doc)
+         return if candidates.empty?
+
+         candidates.each do |candidate_url|
+           result = download_favicon(candidate_url)
+           return result if result
+         end
+         nil
+       rescue Faraday::Error, Nokogiri::SyntaxError
+         nil
+       end
+
+       def try_google_favicon_api
+         uri = URI.parse(website_url)
+         api_url = "https://www.google.com/s2/favicons?domain=#{uri.host}&sz=64"
+         download_favicon(api_url)
+       rescue URI::InvalidURIError
+         nil
+       end
+
+       def extract_icon_candidates(doc)
+         candidates = []
+
+         # Search link[rel] tags for icon types
+         icon_selectors = [
+           'link[rel*="icon"]',
+           'link[rel="apple-touch-icon"]',
+           'link[rel="apple-touch-icon-precomposed"]',
+           'link[rel="mask-icon"]'
+         ]
+
+         icon_selectors.each do |selector|
+           doc.css(selector).each do |link|
+             href = link["href"]
+             next if href.blank?
+
+             absolute_url = resolve_url(href)
+             next unless absolute_url
+
+             sizes = parse_sizes(link["sizes"])
+             candidates << { url: absolute_url, size: sizes }
+           end
+         end
+
+         # Search meta tags for msapplication-TileImage
+         doc.css('meta[name="msapplication-TileImage"]').each do |meta|
+           content = meta["content"]
+           next if content.blank?
+
+           absolute_url = resolve_url(content)
+           candidates << { url: absolute_url, size: 0 } if absolute_url
+         end
+
+         # og:image as last resort
+         doc.css('meta[property="og:image"]').each do |meta|
+           content = meta["content"]
+           next if content.blank?
+
+           absolute_url = resolve_url(content)
+           candidates << { url: absolute_url, size: -1 } if absolute_url
+         end
+
+         # Sort by size descending (prefer larger), deduplicate by URL
+         candidates
+           .sort_by { |c| -(c[:size] || 0) }
+           .uniq { |c| c[:url] }
+           .map { |c| c[:url] }
+       end
+
+       def parse_sizes(sizes_attr)
+         return 0 if sizes_attr.blank?
+         return 0 if sizes_attr.casecmp("any").zero?
+
+         # Parse "32x32", "256x256", etc. -- take the max dimension
+         match = sizes_attr.match(/(\d+)x(\d+)/i)
+         return 0 unless match
+
+         [ match[1].to_i, match[2].to_i ].max
+       end
+
+       def resolve_url(href)
+         return nil if href.blank?
+
+         uri = URI.parse(href)
+         if uri.absolute?
+           href
+         else
+           URI.join(website_url, href).to_s
+         end
+       rescue URI::InvalidURIError, URI::BadURIError
+         nil
+       end
+
+       def download_favicon(url)
+         response = image_client.get(url)
+         return unless response.status == 200
+
+         content_type = response.headers["content-type"]&.split(";")&.first&.strip&.downcase
+         return unless content_type && settings.allowed_content_types.include?(content_type)
+
+         body = response.body
+         return unless body && body.bytesize > 0
+         return if body.bytesize > settings.max_download_size
+
+         filename = derive_filename(url, content_type)
+
+         Result.new(
+           io: StringIO.new(body),
+           filename: filename,
+           content_type: content_type,
+           url: url
+         )
+       rescue Faraday::Error
+         nil
+       end
+
+       def derive_filename(favicon_url, content_type)
+         uri = URI.parse(favicon_url)
+         basename = File.basename(uri.path) if uri.path.present?
+
+         if basename.present? && basename.include?(".")
+           basename
+         else
+           ext = Rack::Mime::MIME_TYPES.invert[content_type] || ".ico"
+           "favicon-#{SecureRandom.hex(8)}#{ext}"
+         end
+       rescue URI::InvalidURIError
+         ext = Rack::Mime::MIME_TYPES.invert[content_type] || ".ico"
+         "favicon-#{SecureRandom.hex(8)}#{ext}"
+       end
+
+       def html_client
+         build_client("text/html, application/xhtml+xml")
+       end
+
+       def image_client
+         build_client("image/*")
+       end
+
+       def build_client(accept_header)
+         timeout = settings.fetch_timeout
+
+         Faraday.new do |f|
+           f.response :follow_redirects, limit: 3
+           f.options.timeout = timeout
+           f.options.open_timeout = [ timeout / 2, 3 ].min
+           f.headers["User-Agent"] = SourceMonitor.config.http.user_agent || "SourceMonitor/#{SourceMonitor::VERSION}"
+           f.headers["Accept"] = accept_header
+           f.adapter Faraday.default_adapter
+         end
+       end
+     end
+   end
+ end
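The candidate ordering in `extract_icon_candidates` can be tried in isolation. The sketch below restates `parse_sizes` and the sort/deduplicate step in plain Ruby, replacing ActiveSupport's `blank?` with explicit nil and empty checks. `rank_candidates` and the example URLs are hypothetical, introduced here only for illustration.

```ruby
# Standalone restatement of the size parsing from the hunk (no ActiveSupport).
def parse_sizes(sizes_attr)
  return 0 if sizes_attr.nil? || sizes_attr.strip.empty?
  return 0 if sizes_attr.casecmp("any").zero?

  # Only the first WxH pair is matched; the larger of its two dimensions is returned.
  match = sizes_attr.match(/(\d+)x(\d+)/i)
  return 0 unless match

  [match[1].to_i, match[2].to_i].max
end

# Hypothetical helper: order candidates largest-first, then drop duplicate URLs,
# keeping the first (largest) occurrence of each URL.
def rank_candidates(candidates)
  candidates
    .sort_by { |c| -(c[:size] || 0) }
    .uniq { |c| c[:url] }
    .map { |c| c[:url] }
end

candidates = [
  { url: "https://example.com/icon-32.png", size: parse_sizes("32x32") },
  { url: "https://example.com/touch.png", size: parse_sizes("180x180") },
  { url: "https://example.com/og.jpg", size: -1 }, # og:image, last resort
  { url: "https://example.com/touch.png", size: parse_sizes(nil) }
]

puts rank_candidates(candidates)
```

Because the regular expression stops at the first match, a multi-value attribute such as `sizes="16x16 64x64"` is ranked as 16 rather than 64. Ruby's `sort_by` is also not guaranteed to be stable, so candidates with equal sizes may come out in any relative order.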
@@ -37,6 +37,7 @@ module SourceMonitor
            attributes[:metadata] = updated_metadata(feed_signature: feed_signature, entries_digest: entries_digest)
            reset_retry_state!(attributes)
            source.update!(attributes)
+           enqueue_favicon_fetch_if_needed
          end

          def update_source_for_not_modified(response, duration_ms)
@@ -62,6 +63,7 @@ module SourceMonitor
            attributes[:metadata] = updated_metadata
            reset_retry_state!(attributes)
            source.update!(attributes)
+           enqueue_favicon_fetch_if_needed
          end

          def update_source_for_failure(error, duration_ms)
@@ -137,6 +139,25 @@ module SourceMonitor
            attributes[:fetch_circuit_until] = nil
          end

+         def enqueue_favicon_fetch_if_needed
+           return unless defined?(ActiveStorage)
+           return unless SourceMonitor.config.favicons.enabled?
+           return if source.website_url.blank?
+           return if source.respond_to?(:favicon) && source.favicon.attached?
+
+           last_attempt = source.metadata&.dig("favicon_last_attempted_at")
+           if last_attempt.present?
+             cooldown_days = SourceMonitor.config.favicons.retry_cooldown_days
+             return if Time.parse(last_attempt) > cooldown_days.days.ago
+           end
+
+           SourceMonitor::FaviconFetchJob.perform_later(source.id)
+         rescue StandardError => error
+           Rails.logger.warn(
+             "[SourceMonitor::SourceUpdater] Failed to enqueue favicon fetch for source #{source.id}: #{error.message}"
+           ) if defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger
+         end
+
          def apply_retry_strategy!(attributes, error, now)
            decision = SourceMonitor::Fetching::RetryPolicy.new(source:, error:, now:).decision

@@ -103,6 +103,7 @@ module SourceMonitor

        def request_headers
          headers = (source.custom_headers || {}).transform_keys { |key| key.to_s }
+         headers["Referer"] = source.website_url if source.website_url.present?
          headers["If-None-Match"] = source.etag if source.etag.present?
          if source.last_modified.present?
            headers["If-Modified-Since"] = source.last_modified.httpdate
@@ -14,7 +14,7 @@ module SourceMonitor
      DEFAULT_TIMEOUT = 15
      DEFAULT_OPEN_TIMEOUT = 5
      DEFAULT_MAX_REDIRECTS = 5
-     DEFAULT_USER_AGENT = "SourceMonitor/#{SourceMonitor::VERSION}"
+     DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; SourceMonitor/#{SourceMonitor::VERSION})"
      RETRY_STATUSES = [ 429, 500, 502, 503, 504 ].freeze

      class << self
@@ -89,8 +89,10 @@ module SourceMonitor
        def default_headers(settings)
          base_headers = {
            "User-Agent" => resolve_callable(settings.user_agent).presence || DEFAULT_USER_AGENT,
-           "Accept" => "application/rss+xml, application/atom+xml, application/json;q=0.9, text/xml;q=0.8",
-           "Accept-Encoding" => "gzip,deflate"
+           "Accept" => "text/html, application/rss+xml, application/atom+xml, application/json;q=0.9, text/xml;q=0.8",
+           "Accept-Encoding" => "gzip,deflate",
+           "Accept-Language" => "en-US,en;q=0.9",
+           "DNT" => "1"
          }

          base_headers.merge(settings.headers || {})
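Since the defaults are combined with `settings.headers` through `Hash#merge`, any custom header with the same key replaces the new 0.8.0 default. The sketch below shows that behavior in isolation. `DEFAULT_HEADERS` and `effective_headers` are hypothetical names for this illustration, and the version in the User-Agent is hard-coded only to keep the snippet standalone.

```ruby
# Default headers as they appear on the "+" side of the hunk.
DEFAULT_HEADERS = {
  "User-Agent" => "Mozilla/5.0 (compatible; SourceMonitor/0.8.0)",
  "Accept" => "text/html, application/rss+xml, application/atom+xml, application/json;q=0.9, text/xml;q=0.8",
  "Accept-Encoding" => "gzip,deflate",
  "Accept-Language" => "en-US,en;q=0.9",
  "DNT" => "1"
}.freeze

# Hypothetical helper: keys from custom_headers win over defaults of the same name.
def effective_headers(custom_headers)
  DEFAULT_HEADERS.merge(custom_headers || {})
end

headers = effective_headers("Accept-Language" => "de-DE,de;q=0.9", "dnt" => "0")
puts headers["Accept-Language"] # the custom value replaces the default
puts headers["DNT"]             # unchanged: "dnt" is a different Hash key than "DNT"
```

Hash keys are case-sensitive, so at this layer an override has to use the same capitalization as the default. Whether the HTTP client normalizes header names afterwards is not visible in this diff.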
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module SourceMonitor
-   VERSION = "0.7.0"
+   VERSION = "0.8.0"
  end
@@ -87,6 +87,10 @@ module SourceMonitor
      autoload :HealthCheckBroadcaster, "source_monitor/import_sessions/health_check_broadcaster"
    end

+   module Favicons
+     autoload :Discoverer, "source_monitor/favicons/discoverer"
+   end
+
    module Images
      autoload :ContentRewriter, "source_monitor/images/content_rewriter"
      autoload :Downloader, "source_monitor/images/downloader"
@@ -0,0 +1,11 @@
+ # frozen_string_literal: true
+
+ namespace :test do
+   desc "Run tests excluding slow integration and system tests"
+   task fast: :environment do
+     $stdout.puts "Running tests excluding integration/ and system/ directories..."
+     test_files = Dir["test/**/*_test.rb"]
+       .reject { |f| f.start_with?("test/integration/", "test/system/") }
+     system("bin/rails", "test", *test_files, exception: true)
+   end
+ end
@@ -23,7 +23,7 @@ Gem::Specification.new do |spec|
    spec.files = Dir.chdir(File.expand_path(__dir__)) do
      tracked_files = `git ls-files -z`.split("\x0")
      tracked_files.reject do |file|
-       file.start_with?(".ai/", ".github/", "coverage/", "node_modules/", "pkg/", "spec/", "test/", "tmp/", "vendor/", "examples/", "bin/")
+       file.start_with?(".ai/", ".github/", ".vbw-planning/", "coverage/", "node_modules/", "pkg/", "spec/", "test/", "tmp/", "vendor/", "examples/", "bin/")
      end
    end
    spec.files += [ "CHANGELOG.md" ].select { |path| File.exist?(File.join(__dir__, path)) }
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: source_monitor
  version: !ruby/object:Gem::Version
-   version: 0.7.0
+   version: 0.8.0
  platform: ruby
  authors:
  - dchuk
@@ -342,98 +342,6 @@ files:
  - ".gitignore"
  - ".rubocop.yml"
  - ".ruby-version"
- - ".vbw-planning/PROJECT.md"
- - ".vbw-planning/ROADMAP.md"
- - ".vbw-planning/SHIPPED.md"
- - ".vbw-planning/STATE.md"
- - ".vbw-planning/codebase/ARCHITECTURE.md"
- - ".vbw-planning/codebase/CONCERNS.md"
- - ".vbw-planning/codebase/CONVENTIONS.md"
- - ".vbw-planning/codebase/DEPENDENCIES.md"
- - ".vbw-planning/codebase/INDEX.md"
- - ".vbw-planning/codebase/META.md"
- - ".vbw-planning/codebase/PATTERNS.md"
- - ".vbw-planning/codebase/STACK.md"
- - ".vbw-planning/codebase/STRUCTURE.md"
- - ".vbw-planning/codebase/TESTING.md"
- - ".vbw-planning/config.json"
- - ".vbw-planning/discovery.json"
- - ".vbw-planning/milestones/default/ROADMAP.md"
- - ".vbw-planning/milestones/default/STATE.md"
- - ".vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-01.md"
- - ".vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-02-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/01-coverage-analysis-quick-wins/PLAN-02.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-01.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-02-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-02.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-03-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-03.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-04-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-04.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-05-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/02-critical-path-test-coverage/PLAN-05.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/03-VERIFICATION-wave1.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/03-VERIFICATION.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-01.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-02-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-02.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-03-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-03.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-04-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/03-large-file-refactoring/PLAN-04.md"
- - ".vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/04-CONTEXT.md"
- - ".vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-01.md"
- - ".vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-02-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-02.md"
- - ".vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-03-SUMMARY.md"
- - ".vbw-planning/milestones/default/phases/04-code-quality-conventions-cleanup/PLAN-03.md"
- - ".vbw-planning/milestones/generator-enhancements/REQUIREMENTS.md"
- - ".vbw-planning/milestones/generator-enhancements/ROADMAP.md"
- - ".vbw-planning/milestones/generator-enhancements/SHIPPED.md"
- - ".vbw-planning/milestones/generator-enhancements/STATE.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/01-generator-steps/01-CONTEXT.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/01-generator-steps/01-VERIFICATION.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/01-generator-steps/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/01-generator-steps/PLAN-01.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/02-verification/02-VERIFICATION.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/02-verification/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/02-verification/PLAN-01.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/03-docs-alignment/03-VERIFICATION.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/03-docs-alignment/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/03-docs-alignment/PLAN-01.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/04-dashboard-ux/04-VERIFICATION.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/04-dashboard-ux/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/04-dashboard-ux/PLAN-01.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/05-VERIFICATION.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/PLAN-01.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/PLAN-02-SUMMARY.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/05-active-storage-images/PLAN-02.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/06-netflix-feed-fix/06-VERIFICATION.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/06-netflix-feed-fix/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/generator-enhancements/phases/06-netflix-feed-fix/PLAN-01.md"
- - ".vbw-planning/milestones/upgrade-assurance/REQUIREMENTS.md"
- - ".vbw-planning/milestones/upgrade-assurance/ROADMAP.md"
- - ".vbw-planning/milestones/upgrade-assurance/STATE.md"
- - ".vbw-planning/milestones/upgrade-assurance/phases/01-upgrade-command/01-VERIFICATION.md"
- - ".vbw-planning/milestones/upgrade-assurance/phases/01-upgrade-command/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/upgrade-assurance/phases/01-upgrade-command/PLAN-01.md"
- - ".vbw-planning/milestones/upgrade-assurance/phases/02-config-deprecation/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/upgrade-assurance/phases/02-config-deprecation/PLAN-01.md"
- - ".vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/03-VERIFICATION.md"
- - ".vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/PLAN-01-SUMMARY.md"
- - ".vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/PLAN-01.md"
- - ".vbw-planning/phases/01-aia-certificate-resolution/.context-dev.md"
- - ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-01-SUMMARY.md"
- - ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-01.md"
- - ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-02-SUMMARY.md"
- - ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-02.md"
- - ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-03-SUMMARY.md"
- - ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-03.md"
  - AGENTS.md
  - CHANGELOG.md
  - CLAUDE.md
@@ -455,6 +363,7 @@ files:
  - app/assets/javascripts/source_monitor/controllers/confirm_navigation_controller.js
  - app/assets/javascripts/source_monitor/controllers/dropdown_controller.js
  - app/assets/javascripts/source_monitor/controllers/modal_controller.js
+ - app/assets/javascripts/source_monitor/controllers/notification_container_controller.js
  - app/assets/javascripts/source_monitor/controllers/notification_controller.js
  - app/assets/javascripts/source_monitor/controllers/select_all_controller.js
  - app/assets/javascripts/source_monitor/turbo_actions.js
@@ -475,6 +384,7 @@ files:
  - app/controllers/source_monitor/logs_controller.rb
  - app/controllers/source_monitor/scrape_logs_controller.rb
  - app/controllers/source_monitor/source_bulk_scrapes_controller.rb
+ - app/controllers/source_monitor/source_favicon_fetches_controller.rb
  - app/controllers/source_monitor/source_fetches_controller.rb
  - app/controllers/source_monitor/source_health_checks_controller.rb
  - app/controllers/source_monitor/source_health_resets_controller.rb
@@ -486,6 +396,7 @@ files:
  - app/helpers/source_monitor/table_sort_helper.rb
  - app/jobs/source_monitor/application_job.rb
  - app/jobs/source_monitor/download_content_images_job.rb
+ - app/jobs/source_monitor/favicon_fetch_job.rb
  - app/jobs/source_monitor/fetch_feed_job.rb
  - app/jobs/source_monitor/import_opml_job.rb
  - app/jobs/source_monitor/import_session_health_check_job.rb
@@ -599,6 +510,7 @@ files:
  - lib/source_monitor/configuration/authentication_settings.rb
  - lib/source_monitor/configuration/deprecation_registry.rb
  - lib/source_monitor/configuration/events.rb
+ - lib/source_monitor/configuration/favicons_settings.rb
  - lib/source_monitor/configuration/fetching_settings.rb
  - lib/source_monitor/configuration/health_settings.rb
  - lib/source_monitor/configuration/http_settings.rb
@@ -621,6 +533,7 @@ files:
  - lib/source_monitor/dashboard/upcoming_fetch_schedule.rb
  - lib/source_monitor/engine.rb
  - lib/source_monitor/events.rb
+ - lib/source_monitor/favicons/discoverer.rb
  - lib/source_monitor/feedjira_extensions.rb
  - lib/source_monitor/fetching/advisory_lock.rb
  - lib/source_monitor/fetching/completion/event_publisher.rb
@@ -720,6 +633,7 @@ files:
  - lib/tasks/source_monitor_assets.rake
  - lib/tasks/source_monitor_setup.rake
  - lib/tasks/source_monitor_tasks.rake
+ - lib/tasks/test_fast.rake
  - lib/tasks/test_smoke.rake
  - package-lock.json
  - package.json
@@ -1,51 +0,0 @@
- <!-- VBW PROJECT TEMPLATE (ARTF-04) -- Human-facing project definition -->
- <!-- Created by /vbw init, maintained by Architect agent -->
-
- # SourceMonitor
-
- ## What This Is
-
- SourceMonitor is a mountable Rails 8 engine for ingesting RSS/Atom/JSON feeds, scraping article content via pluggable adapters, and providing Solid Queue-powered dashboards for monitoring and remediation. It is distributed as a RubyGem and integrates with host Rails applications.
-
- ## Core Value
-
- A drop-in Rails engine that gives any Rails application feed monitoring, content scraping, and operational dashboards without building the plumbing from scratch.
-
- ## Requirements
-
- ### Validated
-
- None yet.
-
- ### Active
-
- - [ ] Close test coverage gaps identified in the coverage baseline
- - [ ] Refactor large files for maintainability and single-responsibility
- - [ ] Ensure codebase follows Rails best practices and conventions throughout
-
- ### Out of Scope
-
- - Multi-database support (MySQL/SQLite) -- Keep PostgreSQL-only for now
- - Built-in authentication -- Continue relying on host app for auth
-
- ## Context
-
- This is a brownfield Rails engine at v0.2.1 with 530 source files (325 Ruby, 48 ERB). The codebase has 130 test files, CI/CD via GitHub Actions, and a coverage baseline tracking 2329 lines of uncovered code. Key technical debt includes large files (FeedFetcher 627 lines, Configuration 655 lines, ImportSessionsController 792 lines) and coverage gaps in critical paths.
-
- ## Constraints
-
- - **Ruby**: >= 3.4.0
- - **Rails**: >= 8.0.3, < 9.0
- - **Database**: PostgreSQL only
- - **Testing**: Minitest (not RSpec), branch coverage via SimpleCov
-
- ## Key Decisions
-
- | Decision | Rationale | Outcome |
- |----------|-----------|---------|
- | Focus on coverage + refactoring before new features | Stabilize existing code before adding complexity | Pending |
- | Keep PostgreSQL-only | Not worth the complexity of multi-DB support at this stage | Confirmed |
- | Keep host-app auth | Engine should be composable, not opinionated about auth | Confirmed |
-
- ---
- *Last updated: 2026-02-09 after VBW bootstrap*
@@ -1,32 +0,0 @@
- # Roadmap
-
- ## Milestone: aia-ssl-fix
-
- ### Phases
-
- 1. [x] **AIA Certificate Resolution** -- Fix SSL failures for feeds with missing intermediate certificates by implementing AIA (Authority Information Access) resolution
-
- ### Phase Details
-
- #### Phase 1: AIA Certificate Resolution
-
- **Goal:** Implement automatic AIA intermediate certificate fetching so feeds like netflixtechblog.com (served via Medium/AWS with wrong intermediates) succeed without manual cert configuration.
-
- **Requirements:**
- - REQ-AIA-01: Create AIAResolver module with thread-safe cache and 1-hour TTL
- - REQ-AIA-02: Add cert_store: parameter to HTTP.client for custom cert stores
- - REQ-AIA-03: On Faraday::SSLError, attempt AIA resolution before failing
- - REQ-AIA-04: Best-effort only -- never make things worse (rescue StandardError -> nil)
-
- **Success Criteria:**
- - [ ] AIAResolver.resolve(hostname) fetches leaf cert, extracts AIA URL, downloads intermediate
- - [ ] HTTP.client(cert_store:) accepts and uses custom cert stores
- - [ ] FeedFetcher retries once with AIA-resolved cert store on SSL failure
- - [ ] All existing tests pass (1003+), new tests cover AIA paths
- - [ ] RuboCop zero offenses, Brakeman zero warnings
-
- ### Progress
-
- | Phase | Status | Plans | Completed |
- |-------|--------|-------|-----------|
- | 1. AIA Certificate Resolution | Planned | 3 | 0 |