source_monitor 0.10.2 → 0.11.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.claude/agent-memory/vbw-vbw-debugger/MEMORY.md +15 -0
- data/.claude/skills/sm-configuration-setting/reference/settings-catalog.md +3 -3
- data/.claude/skills/sm-configure/reference/configuration-reference.md +3 -3
- data/.claude/skills/sm-domain-model/SKILL.md +2 -2
- data/.claude/skills/sm-domain-model/reference/table-structure.md +3 -1
- data/.claude/skills/sm-engine-migration/SKILL.md +1 -1
- data/.claude/skills/sm-engine-migration/reference/migration-conventions.md +1 -1
- data/.claude/skills/sm-health-rule/SKILL.md +18 -21
- data/.claude/skills/sm-health-rule/reference/health-system.md +1 -1
- data/.claude/skills/sm-host-setup/reference/initializer-template.md +2 -2
- data/.claude/skills/sm-upgrade/reference/version-history.md +17 -12
- data/CHANGELOG.md +42 -0
- data/CLAUDE.md +2 -2
- data/Gemfile +1 -0
- data/Gemfile.lock +4 -1
- data/README.md +3 -3
- data/VERSION +1 -1
- data/app/assets/builds/source_monitor/application.css +132 -12
- data/app/assets/builds/source_monitor/application.js +25 -1
- data/app/assets/builds/source_monitor/application.js.map +2 -2
- data/app/assets/javascripts/source_monitor/controllers/modal_controller.js +8 -0
- data/app/assets/javascripts/source_monitor/controllers/select_all_controller.js +22 -2
- data/app/assets/stylesheets/source_monitor/application.tailwind.css +1 -1
- data/app/controllers/source_monitor/bulk_scrape_enablements_controller.rb +57 -0
- data/app/controllers/source_monitor/dashboard_controller.rb +10 -1
- data/app/controllers/source_monitor/import_history_dismissals_controller.rb +20 -0
- data/app/controllers/source_monitor/source_retries_controller.rb +10 -2
- data/app/controllers/source_monitor/source_scrape_tests_controller.rb +73 -0
- data/app/controllers/source_monitor/sources_controller.rb +51 -9
- data/app/helpers/source_monitor/application_helper.rb +24 -0
- data/app/helpers/source_monitor/health_badge_helper.rb +7 -20
- data/app/jobs/source_monitor/fetch_feed_job.rb +32 -3
- data/app/jobs/source_monitor/source_health_check_job.rb +1 -1
- data/app/models/source_monitor/fetch_log.rb +4 -0
- data/app/models/source_monitor/import_history.rb +2 -0
- data/app/models/source_monitor/source.rb +47 -2
- data/app/views/source_monitor/dashboard/_fetch_schedule.html.erb +94 -68
- data/app/views/source_monitor/dashboard/_scrape_recommendations.html.erb +17 -0
- data/app/views/source_monitor/dashboard/_stats.html.erb +19 -0
- data/app/views/source_monitor/dashboard/index.html.erb +7 -1
- data/app/views/source_monitor/import_sessions/health_check/_row.html.erb +2 -2
- data/app/views/source_monitor/shared/_pagination.html.erb +74 -0
- data/app/views/source_monitor/source_scrape_tests/_result.html.erb +81 -0
- data/app/views/source_monitor/source_scrape_tests/show.html.erb +60 -0
- data/app/views/source_monitor/sources/_bulk_scrape_enable_modal.html.erb +29 -0
- data/app/views/source_monitor/sources/_details.html.erb +19 -1
- data/app/views/source_monitor/sources/_empty_state_row.html.erb +1 -1
- data/app/views/source_monitor/sources/_import_history_panel.html.erb +12 -5
- data/app/views/source_monitor/sources/_row.html.erb +34 -6
- data/app/views/source_monitor/sources/index.html.erb +184 -132
- data/config/brakeman.ignore +11 -1
- data/config/routes.rb +5 -0
- data/db/migrate/20260305120000_add_dismissed_at_to_import_histories.rb +7 -0
- data/db/migrate/20260306233004_add_error_category_to_fetch_logs.rb +8 -0
- data/db/migrate/20260307120000_add_consecutive_fetch_failures_to_sources.rb +11 -0
- data/db/migrate/20260312120000_simplify_health_status_values.rb +20 -0
- data/docs/configuration.md +9 -1
- data/docs/troubleshooting.md +9 -0
- data/docs/upgrade.md +31 -0
- data/lib/generators/source_monitor/install/templates/source_monitor.rb.tt +2 -3
- data/lib/source_monitor/analytics/scrape_recommendations.rb +27 -0
- data/lib/source_monitor/configuration/health_settings.rb +0 -2
- data/lib/source_monitor/configuration/scraping_settings.rb +8 -1
- data/lib/source_monitor/dashboard/queries/stats_query.rb +12 -1
- data/lib/source_monitor/dashboard/queries.rb +6 -3
- data/lib/source_monitor/dashboard/recent_activity_presenter.rb +6 -5
- data/lib/source_monitor/dashboard/upcoming_fetch_schedule.rb +40 -54
- data/lib/source_monitor/favicons/discoverer.rb +16 -0
- data/lib/source_monitor/favicons/svg_converter.rb +60 -0
- data/lib/source_monitor/fetching/cloudflare_bypass.rb +79 -0
- data/lib/source_monitor/fetching/feed_fetcher/source_updater.rb +82 -2
- data/lib/source_monitor/fetching/feed_fetcher.rb +55 -1
- data/lib/source_monitor/fetching/fetch_error.rb +27 -0
- data/lib/source_monitor/fetching/fetch_runner.rb +4 -0
- data/lib/source_monitor/fetching/retry_policy.rb +4 -0
- data/lib/source_monitor/health/import_source_health_check.rb +3 -3
- data/lib/source_monitor/health/source_health_monitor.rb +9 -14
- data/lib/source_monitor/health/source_health_reset.rb +1 -1
- data/lib/source_monitor/pagination/paginator.rb +18 -1
- data/lib/source_monitor/version.rb +1 -1
- data/lib/source_monitor.rb +3 -0
- metadata +17 -1
data/app/views/source_monitor/sources/_row.html.erb
CHANGED
|
@@ -1,6 +1,7 @@
|
|
|
1
1
|
<% rate_map = local_assigns[:item_activity_rates] || {} %>
|
|
2
2
|
<% avg_feed_words_map = local_assigns[:avg_feed_word_counts] || {} %>
|
|
3
3
|
<% avg_scraped_words_map = local_assigns[:avg_scraped_word_counts] || {} %>
|
|
4
|
+
<% scrape_candidates = local_assigns[:scrape_candidate_ids] || Set.new %>
|
|
4
5
|
<% activity_rate = rate_map.fetch(source.id, 0.0) %>
|
|
5
6
|
<% health_status_override = local_assigns[:health_status_override] %>
|
|
6
7
|
<% health_status = if !source.active?
|
|
@@ -24,22 +25,48 @@
|
|
|
24
25
|
<% delete_path = delete_query.present? ? source_monitor.source_path(source, q: delete_query) : source_monitor.source_path(source) %>
|
|
25
26
|
|
|
26
27
|
<tr id="<%= dom_id(source, :row) %>" class="hover:bg-slate-50">
|
|
28
|
+
<td class="w-10 px-3 py-4">
|
|
29
|
+
<% if scrape_candidates.include?(source.id) %>
|
|
30
|
+
<input type="checkbox"
|
|
31
|
+
name="bulk_scrape_enablement[source_ids][]"
|
|
32
|
+
value="<%= source.id %>"
|
|
33
|
+
data-select-all-target="item"
|
|
34
|
+
data-action="select-all#toggleItem"
|
|
35
|
+
class="rounded border-slate-300 text-violet-600 focus:ring-violet-500"
|
|
36
|
+
aria-label="Select <%= source.name %>">
|
|
37
|
+
<% end %>
|
|
38
|
+
</td>
|
|
27
39
|
<td class="px-6 py-4">
|
|
28
40
|
<div class="flex items-center gap-3">
|
|
29
41
|
<%= source_favicon_tag(source, size: 24) %>
|
|
30
42
|
<div>
|
|
31
|
-
<div class="font-medium text-slate-900">
|
|
43
|
+
<div class="flex items-center gap-2 font-medium text-slate-900">
|
|
32
44
|
<%= link_to source.name,
|
|
33
45
|
source_monitor.source_path(source),
|
|
34
46
|
class: "text-slate-900 hover:text-blue-600 focus:outline-none focus:ring-2 focus:ring-blue-500 focus:ring-offset-2",
|
|
35
47
|
data: { turbo_frame: "_top" } %>
|
|
48
|
+
<% if source.last_error.to_s.match?(/\bblocked\b/i) %>
|
|
49
|
+
<% blocked_detail = source.last_error.to_s.match(/blocked by (\w+)/i) %>
|
|
50
|
+
<span class="inline-flex items-center rounded-full bg-rose-100 px-2 py-0.5 text-[10px] font-semibold text-rose-700"
|
|
51
|
+
title="<%= blocked_detail ? "#{blocked_detail[1].capitalize} Blocked" : "Blocked" %>"
|
|
52
|
+
data-testid="source-blocked-badge">
|
|
53
|
+
Blocked
|
|
54
|
+
</span>
|
|
55
|
+
<% end %>
|
|
36
56
|
</div>
|
|
37
57
|
<div class="text-xs text-slate-500 truncate max-w-xs"><%= external_link_to source.feed_url, source.feed_url, class: "text-slate-500 hover:text-blue-500" %></div>
|
|
58
|
+
<% if scrape_candidates.include?(source.id) %>
|
|
59
|
+
<span class="mt-1 inline-flex items-center rounded-full bg-violet-100 px-2 py-0.5 text-[10px] font-semibold text-violet-700"
|
|
60
|
+
title="Low feed word count — consider enabling scraping"
|
|
61
|
+
data-testid="scrape-recommendation-badge">
|
|
62
|
+
Scrape Recommended
|
|
63
|
+
</span>
|
|
64
|
+
<% end %>
|
|
38
65
|
</div>
|
|
39
66
|
</div>
|
|
40
67
|
</td>
|
|
41
68
|
<td class="px-6 py-4">
|
|
42
|
-
<div class="flex flex-col gap-
|
|
69
|
+
<div class="flex flex-col gap-1 text-xs">
|
|
43
70
|
<% if source.active? %>
|
|
44
71
|
<%= render "source_monitor/sources/health_status_badge",
|
|
45
72
|
source: source,
|
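Review note: the "Blocked" badge added in this hunk depends on two regular expressions applied to `source.last_error`. The following is a minimal plain-Ruby sketch of that logic outside ERB. The method name `blocked_badge_title` and the sample error strings are hypothetical and do not appear in the gem.

```ruby
# Hypothetical helper mirroring the two regex checks in the row partial.
# Returns nil when no badge would be rendered, otherwise the tooltip text.
def blocked_badge_title(last_error)
  message = last_error.to_s # nil-safe, as in the view
  return nil unless message.match?(/\bblocked\b/i)

  # Optional detail: the word following "blocked by", e.g. a vendor name.
  detail = message.match(/blocked by (\w+)/i)
  detail ? "#{detail[1].capitalize} Blocked" : "Blocked"
end

# Sample inputs are made up for illustration.
p blocked_badge_title("Request blocked by cloudflare")
p blocked_badge_title("Feed blocked")
p blocked_badge_title("unblocked feed")
p blocked_badge_title(nil)
```

The `\b` anchors keep words such as "unblocked" from triggering the badge, and `to_s` keeps a `nil` error from raising.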
|
@@ -50,6 +77,10 @@
|
|
|
50
77
|
<% if source.rolling_success_rate.present? %>
|
|
51
78
|
<span class="text-[11px] text-slate-500">Success Rate: <%= number_to_percentage(source.rolling_success_rate * 100, precision: 0) %></span>
|
|
52
79
|
<% end %>
|
|
80
|
+
</div>
|
|
81
|
+
</td>
|
|
82
|
+
<td class="px-6 py-4">
|
|
83
|
+
<div class="flex flex-col gap-1 text-xs">
|
|
53
84
|
<span class="inline-flex items-center rounded-full px-3 py-1 font-semibold <%= fetch_status[:classes] %>">
|
|
54
85
|
<%= loading_spinner_svg(css_class: "mr-1 h-3.5 w-3.5 animate-spin text-blue-500") if fetch_status[:show_spinner] %>
|
|
55
86
|
<%= fetch_status[:label] %>
|
|
@@ -57,12 +88,9 @@
|
|
|
57
88
|
<span class="ml-2 font-normal text-[10px] text-slate-500">(since <%= source.last_fetch_started_at.strftime("%H:%M:%S") %>)</span>
|
|
58
89
|
<% end %>
|
|
59
90
|
</span>
|
|
91
|
+
<span class="text-[11px] text-slate-500">(<%= number_with_precision(1440.0 / source.fetch_interval_minutes, precision: 1) %>x / day)</span>
|
|
60
92
|
</div>
|
|
61
93
|
</td>
|
|
62
|
-
<td class="px-6 py-4 text-sm">
|
|
63
|
-
<%= "#{source.fetch_interval_minutes} min" %>
|
|
64
|
-
<span class="text-xs text-slate-500">(~<%= number_with_precision(source.fetch_interval_minutes / 60.0, precision: 2) %> h)</span>
|
|
65
|
-
</td>
|
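Review note: the dedicated interval column ("N min (~H h)") is replaced by a "fetches per day" figure computed as `1440.0 / source.fetch_interval_minutes`. The sketch below shows the arithmetic in plain Ruby. The helper name `fetches_per_day` and the guard clause are assumptions added for illustration. Whether the model guarantees a positive interval is not visible in this diff.

```ruby
MINUTES_PER_DAY = 1440.0

# Hypothetical helper: converts a fetch interval in minutes into fetches
# per day, rounded to one decimal place (the view uses precision: 1).
# Returns nil for missing or non-positive intervals instead of dividing.
def fetches_per_day(interval_minutes)
  return nil unless interval_minutes.is_a?(Numeric) && interval_minutes.positive?

  (MINUTES_PER_DAY / interval_minutes).round(1)
end

p fetches_per_day(60)
p fetches_per_day(360)
p fetches_per_day(nil)
```

Without a guard, a zero interval yields `Infinity` under float division and a `nil` interval raises `TypeError`.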
|
66
94
|
<td class="px-6 py-4 text-sm"><%= source.items_count %></td>
|
|
67
95
|
<td class="px-6 py-4 text-sm">
|
|
68
96
|
<%= number_with_precision(activity_rate, precision: 2) %>
|
data/app/views/source_monitor/sources/index.html.erb
CHANGED
|
@@ -12,6 +12,13 @@
|
|
|
12
12
|
</div>
|
|
13
13
|
</div>
|
|
14
14
|
|
|
15
|
+
<%= render "source_monitor/sources/import_history_panel", import_histories: @recent_import_histories %>
|
|
16
|
+
|
|
17
|
+
<%= render "source_monitor/sources/fetch_interval_heatmap",
|
|
18
|
+
fetch_interval_distribution: @fetch_interval_distribution,
|
|
19
|
+
selected_bucket: @selected_fetch_interval_bucket,
|
|
20
|
+
search_params: @search_params %>
|
|
21
|
+
|
|
15
22
|
<%= search_form_for @q, url: source_monitor.sources_path, method: :get, html: { class: "flex flex-wrap items-end gap-3", data: { turbo_frame: "source_monitor_sources_table" } } do |form| %>
|
|
16
23
|
<div class="flex-1 min-w-[12rem]">
|
|
17
24
|
<%= form.label @search_field, "Search sources", class: "sr-only" %>
|
|
@@ -27,7 +34,7 @@
|
|
|
27
34
|
</div>
|
|
28
35
|
<div>
|
|
29
36
|
<%= form.label :health_status_eq, "Health", class: "block text-xs font-medium text-slate-500 mb-1" %>
|
|
30
|
-
<%= form.select :health_status_eq, options_for_select([["All Health", ""], ["
|
|
37
|
+
<%= form.select :health_status_eq, options_for_select([["All Health", ""], ["Working", "working"], ["Declining", "declining"], ["Improving", "improving"], ["Failing", "failing"]], @search_params["health_status_eq"].to_s), {}, class: "rounded-md border border-slate-200 bg-white px-2 py-2 text-sm text-slate-700 focus:border-blue-500 focus:outline-none focus:ring-1 focus:ring-blue-500", onchange: "this.form.requestSubmit()" %>
|
|
31
38
|
</div>
|
|
32
39
|
<div>
|
|
33
40
|
<%= form.label :feed_format_eq, "Format", class: "block text-xs font-medium text-slate-500 mb-1" %>
|
|
@@ -38,19 +45,16 @@
|
|
|
38
45
|
<%= form.label :scraper_adapter_eq, "Adapter", class: "block text-xs font-medium text-slate-500 mb-1" %>
|
|
39
46
|
<%= form.select :scraper_adapter_eq, options_for_select([["All Adapters", ""]] + adapter_options.map { |a| [a.titleize, a] }, @search_params["scraper_adapter_eq"].to_s), {}, class: "rounded-md border border-slate-200 bg-white px-2 py-2 text-sm text-slate-700 focus:border-blue-500 focus:outline-none focus:ring-1 focus:ring-blue-500", onchange: "this.form.requestSubmit()" %>
|
|
40
47
|
</div>
|
|
48
|
+
<div>
|
|
49
|
+
<%= form.label :scraping_enabled_eq, "Scrape", class: "block text-xs font-medium text-slate-500 mb-1" %>
|
|
50
|
+
<%= form.select :scraping_enabled_eq, options_for_select([["All Sources", ""], ["Scraping Enabled", "true"], ["Scraping Disabled", "false"], ["Recommendations", "recommend"]], @search_params["scraping_enabled_eq"].to_s), {}, class: "rounded-md border border-slate-200 bg-white px-2 py-2 text-sm text-slate-700 focus:border-blue-500 focus:outline-none focus:ring-1 focus:ring-blue-500", onchange: "this.form.requestSubmit()" %>
|
|
51
|
+
</div>
|
|
41
52
|
</div>
|
|
42
53
|
<% end %>
|
|
43
54
|
|
|
44
|
-
<%= render "source_monitor/sources/import_history_panel", import_histories: @recent_import_histories %>
|
|
45
|
-
|
|
46
|
-
<%= render "source_monitor/sources/fetch_interval_heatmap",
|
|
47
|
-
fetch_interval_distribution: @fetch_interval_distribution,
|
|
48
|
-
selected_bucket: @selected_fetch_interval_bucket,
|
|
49
|
-
search_params: @search_params %>
|
|
50
|
-
|
|
51
55
|
<div class="overflow-hidden rounded-lg border border-slate-200 bg-white shadow-sm">
|
|
52
56
|
<%= turbo_frame_tag "source_monitor_sources_table" do %>
|
|
53
|
-
<% dropdown_filter_keys = %w[active_eq health_status_eq feed_format_eq scraper_adapter_eq] %>
|
|
57
|
+
<% dropdown_filter_keys = %w[active_eq health_status_eq feed_format_eq scraper_adapter_eq scraping_enabled_eq avg_feed_words_lt] %>
|
|
54
58
|
<% active_dropdown_filters = dropdown_filter_keys.select { |k| @search_params[k].present? } %>
|
|
55
59
|
<% has_any_filter = @search_term.present? || @fetch_interval_filter.present? || active_dropdown_filters.any? %>
|
|
56
60
|
<% if has_any_filter %>
|
|
@@ -87,7 +91,9 @@
|
|
|
87
91
|
"active_eq" => @search_params["active_eq"] == "true" ? "Status: Active" : "Status: Paused",
|
|
88
92
|
"health_status_eq" => "Health: #{@search_params['health_status_eq']&.titleize}",
|
|
89
93
|
"feed_format_eq" => "Format: #{@search_params['feed_format_eq']&.upcase}",
|
|
90
|
-
"scraper_adapter_eq" => "Adapter: #{@search_params['scraper_adapter_eq']&.titleize}"
|
|
94
|
+
"scraper_adapter_eq" => "Adapter: #{@search_params['scraper_adapter_eq']&.titleize}",
|
|
95
|
+
"scraping_enabled_eq" => @search_params["scraping_enabled_eq"] == "true" ? "Scraping: Enabled" : "Scraping: Disabled",
|
|
96
|
+
"avg_feed_words_lt" => "Avg Feed Words: < #{@search_params['avg_feed_words_lt']}"
|
|
91
97
|
} %>
|
|
92
98
|
<% active_dropdown_filters.each do |filter_key| %>
|
|
93
99
|
<% clear_query = @search_params.dup %>
|
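Review note: the filter-chip label for `scraping_enabled_eq` uses a ternary that maps every value other than `"true"` to "Scraping: Disabled", while the new dropdown also offers the value `"recommend"`. The controller may translate that value before the view sees it (the new `avg_feed_words_lt` key hints at this), but that code is not part of this excerpt. The sketch below shows a three-way label as one possible alternative. The method name and the "Recommendations" wording are assumptions.

```ruby
# Hypothetical label helper for the scraping filter chip.
# Unknown values return nil so the caller can skip rendering a chip.
def scraping_filter_label(value)
  case value.to_s
  when "true"      then "Scraping: Enabled"
  when "false"     then "Scraping: Disabled"
  when "recommend" then "Scraping: Recommendations"
  end
end

p scraping_filter_label("true")
p scraping_filter_label("recommend")
```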
|
@@ -107,131 +113,177 @@
|
|
|
107
113
|
<% end %>
|
|
108
114
|
</div>
|
|
109
115
|
<% end %>
|
|
110-220
|
-
[111 removed lines; their content is not preserved in this view]
|
|
116
|
+
<%= form_with url: source_monitor.bulk_scrape_enablements_path, data: { controller: "select-all" } do |form| %>
|
|
117
|
+
<table class="min-w-full divide-y divide-slate-200 text-left text-sm">
|
|
118
|
+
<thead class="bg-slate-50 text-xs font-semibold uppercase tracking-wide text-slate-500">
|
|
119
|
+
<tr>
|
|
120
|
+
<th scope="col" class="w-10 px-3 py-3">
|
|
121
|
+
<% if @scrape_candidate_ids.any? %>
|
|
122
|
+
<input type="checkbox"
|
|
123
|
+
data-select-all-target="master"
|
|
124
|
+
data-action="select-all#toggleAll"
|
|
125
|
+
class="rounded border-slate-300 text-violet-600 focus:ring-violet-500"
|
|
126
|
+
aria-label="Select all sources">
|
|
127
|
+
<% end %>
|
|
128
|
+
</th>
|
|
129
|
+
<th scope="col"
|
|
130
|
+
class="px-6 py-3"
|
|
131
|
+
data-sort-column="name"
|
|
132
|
+
aria-sort="<%= table_sort_aria(@q, :name) %>">
|
|
133
|
+
<span class="inline-flex items-center gap-1">
|
|
134
|
+
<%= table_sort_link(
|
|
135
|
+
@q,
|
|
136
|
+
:name,
|
|
137
|
+
"Source",
|
|
138
|
+
frame: "source_monitor_sources_table",
|
|
139
|
+
default_order: :asc,
|
|
140
|
+
secondary: ["created_at desc"],
|
|
141
|
+
html_options: {
|
|
142
|
+
class: "inline-flex items-center gap-1 text-xs font-semibold uppercase tracking-wide text-slate-600 hover:text-slate-900 focus:outline-none"
|
|
143
|
+
}
|
|
144
|
+
) %>
|
|
145
|
+
<span class="text-[11px] text-slate-400" aria-hidden="true"><%= table_sort_arrow(@q, :name) %></span>
|
|
146
|
+
</span>
|
|
147
|
+
</th>
|
|
148
|
+
<th scope="col" class="px-6 py-3">Health</th>
|
|
149
|
+
<th scope="col" class="px-6 py-3">Fetch Status</th>
|
|
150
|
+
<th scope="col"
|
|
151
|
+
class="px-6 py-3"
|
|
152
|
+
data-sort-column="items_count"
|
|
153
|
+
aria-sort="<%= table_sort_aria(@q, :items_count) %>">
|
|
154
|
+
<span class="inline-flex items-center gap-1">
|
|
155
|
+
<%= table_sort_link(
|
|
156
|
+
@q,
|
|
157
|
+
:items_count,
|
|
158
|
+
"Items",
|
|
159
|
+
frame: "source_monitor_sources_table",
|
|
160
|
+
default_order: :desc,
|
|
161
|
+
secondary: ["created_at desc"],
|
|
162
|
+
html_options: {
|
|
163
|
+
class: "inline-flex items-center gap-1 text-xs font-semibold uppercase tracking-wide text-slate-600 hover:text-slate-900 focus:outline-none"
|
|
164
|
+
}
|
|
165
|
+
) %>
|
|
166
|
+
<span class="text-[11px] text-slate-400" aria-hidden="true"><%= table_sort_arrow(@q, :items_count) %></span>
|
|
167
|
+
</span>
|
|
168
|
+
</th>
|
|
169
|
+
<th scope="col"
|
|
170
|
+
class="px-6 py-3"
|
|
171
|
+
data-sort-column="new_items_per_day"
|
|
172
|
+
aria-sort="<%= table_sort_aria(@q, :new_items_per_day) %>">
|
|
173
|
+
<span class="inline-flex items-center gap-1">
|
|
174
|
+
<%= table_sort_link(
|
|
175
|
+
@q,
|
|
176
|
+
:new_items_per_day,
|
|
177
|
+
"New Items / Day",
|
|
178
|
+
frame: "source_monitor_sources_table",
|
|
179
|
+
default_order: :desc,
|
|
180
|
+
secondary: ["created_at desc"],
|
|
181
|
+
html_options: {
|
|
182
|
+
class: "inline-flex items-center gap-1 text-xs font-semibold uppercase tracking-wide text-slate-600 hover:text-slate-900 focus:outline-none"
|
|
183
|
+
}
|
|
184
|
+
) %>
|
|
185
|
+
<span class="text-[11px] text-slate-400" aria-hidden="true"><%= table_sort_arrow(@q, :new_items_per_day) %></span>
|
|
186
|
+
</span>
|
|
187
|
+
</th>
|
|
188
|
+
<th scope="col"
|
|
189
|
+
class="px-6 py-3"
|
|
190
|
+
data-sort-column="avg_feed_words"
|
|
191
|
+
aria-sort="<%= table_sort_aria(@q, :avg_feed_words) %>">
|
|
192
|
+
<span class="inline-flex items-center gap-1">
|
|
193
|
+
<%= table_sort_link(
|
|
194
|
+
@q,
|
|
195
|
+
:avg_feed_words,
|
|
196
|
+
"Avg Feed Words",
|
|
197
|
+
frame: "source_monitor_sources_table",
|
|
198
|
+
default_order: :desc,
|
|
199
|
+
secondary: ["created_at desc"],
|
|
200
|
+
html_options: {
|
|
201
|
+
class: "inline-flex items-center gap-1 text-xs font-semibold uppercase tracking-wide text-slate-600 hover:text-slate-900 focus:outline-none"
|
|
202
|
+
}
|
|
203
|
+
) %>
|
|
204
|
+
<span class="text-[11px] text-slate-400" aria-hidden="true"><%= table_sort_arrow(@q, :avg_feed_words) %></span>
|
|
205
|
+
</span>
|
|
206
|
+
</th>
|
|
207
|
+
<th scope="col"
|
|
208
|
+
class="px-6 py-3"
|
|
209
|
+
data-sort-column="avg_scraped_words"
|
|
210
|
+
aria-sort="<%= table_sort_aria(@q, :avg_scraped_words) %>">
|
|
211
|
+
<span class="inline-flex items-center gap-1">
|
|
212
|
+
<%= table_sort_link(
|
|
213
|
+
@q,
|
|
214
|
+
:avg_scraped_words,
|
|
215
|
+
"Avg Scraped Words",
|
|
216
|
+
frame: "source_monitor_sources_table",
|
|
217
|
+
default_order: :desc,
|
|
218
|
+
secondary: ["created_at desc"],
|
|
219
|
+
html_options: {
|
|
220
|
+
class: "inline-flex items-center gap-1 text-xs font-semibold uppercase tracking-wide text-slate-600 hover:text-slate-900 focus:outline-none"
|
|
221
|
+
}
|
|
222
|
+
) %>
|
|
223
|
+
<span class="text-[11px] text-slate-400" aria-hidden="true"><%= table_sort_arrow(@q, :avg_scraped_words) %></span>
|
|
224
|
+
</span>
|
|
225
|
+
</th>
|
|
226
|
+
<th scope="col"
|
|
227
|
+
class="px-6 py-3"
|
|
228
|
+
data-sort-column="last_fetched_at"
|
|
229
|
+
aria-sort="<%= table_sort_aria(@q, :last_fetched_at) %>">
|
|
230
|
+
<span class="inline-flex items-center gap-1">
|
|
231
|
+
<%= table_sort_link(
|
|
232
|
+
@q,
|
|
233
|
+
:last_fetched_at,
|
|
234
|
+
"Last Fetch",
|
|
235
|
+
frame: "source_monitor_sources_table",
|
|
236
|
+
default_order: :desc,
|
|
237
|
+
secondary: ["created_at desc"],
|
|
238
|
+
html_options: {
|
|
239
|
+
class: "inline-flex items-center gap-1 text-xs font-semibold uppercase tracking-wide text-slate-600 hover:text-slate-900 focus:outline-none"
|
|
240
|
+
}
|
|
241
|
+
) %>
|
|
242
|
+
<span class="text-[11px] text-slate-400" aria-hidden="true"><%= table_sort_arrow(@q, :last_fetched_at) %></span>
|
|
243
|
+
</span>
|
|
244
|
+
</th>
|
|
245
|
+
<th scope="col" class="px-6 py-3"></th>
|
|
246
|
+
</tr>
|
|
247
|
+
</thead>
|
|
248
|
+
<tbody id="source_monitor_sources_table_body" class="divide-y divide-slate-100 text-slate-700">
|
|
249
|
+
<%= render partial: "source_monitor/sources/row",
|
|
250
|
+
collection: @sources,
|
|
251
|
+
as: :source,
|
|
252
|
+
locals: {
|
|
253
|
+
item_activity_rates: @item_activity_rates,
|
|
254
|
+
avg_feed_word_counts: @avg_feed_word_counts,
|
|
255
|
+
avg_scraped_word_counts: @avg_scraped_word_counts,
|
|
256
|
+
search_params: @search_params,
|
|
257
|
+
scrape_candidate_ids: @scrape_candidate_ids
|
|
258
|
+
} %>
|
|
221
259
|
|
|
222
|
-
|
|
223
|
-
|
|
224
|
-
|
|
225
|
-
<span class="inline-flex items-center rounded-md border border-slate-200 px-3 py-2 text-sm font-medium text-slate-300">Previous</span>
|
|
226
|
-
<% end %>
|
|
260
|
+
<%= render("source_monitor/sources/empty_state_row") if @sources.blank? %>
|
|
261
|
+
</tbody>
|
|
262
|
+
</table>
|
|
227
263
|
|
|
228
|
-
|
|
229
|
-
|
|
230
|
-
|
|
231
|
-
|
|
232
|
-
|
|
264
|
+
<div data-select-all-target="actionBar" class="hidden sticky bottom-0 border-t border-slate-200 bg-white px-4 py-3 shadow-md">
|
|
265
|
+
<div class="flex items-center justify-between">
|
|
266
|
+
<span class="text-sm text-slate-700">
|
|
267
|
+
<span data-select-all-target="count">0</span> source(s) selected
|
|
268
|
+
</span>
|
|
269
|
+
<button type="button"
|
|
270
|
+
data-action="modal#open"
|
|
271
|
+
class="inline-flex items-center rounded-md bg-violet-600 px-4 py-2 text-sm font-semibold text-white shadow hover:bg-violet-500">
|
|
272
|
+
Enable Scraping
|
|
273
|
+
</button>
|
|
274
|
+
</div>
|
|
233
275
|
</div>
|
|
234
|
-
|
|
276
|
+
|
|
277
|
+
<%= render "source_monitor/sources/bulk_scrape_enable_modal" %>
|
|
278
|
+
<% end %>
|
|
279
|
+
<% extra_params = {} %>
|
|
280
|
+
<% extra_params[:q] = @search_params if @search_params.present? %>
|
|
281
|
+
<% extra_params[:per_page] = params[:per_page] if params[:per_page].present? %>
|
|
282
|
+
<%= render "source_monitor/shared/pagination",
|
|
283
|
+
paginator_result: @paginator,
|
|
284
|
+
base_path: source_monitor.sources_path,
|
|
285
|
+
extra_params: extra_params,
|
|
286
|
+
turbo_frame: "source_monitor_sources_table" %>
|
|
235
287
|
<% end %>
|
|
236
288
|
</div>
|
|
237
289
|
</div>
|
data/config/brakeman.ignore
CHANGED
|
@@ -10,8 +10,18 @@
|
|
|
10
10
|
"line": 77,
|
|
11
11
|
"code": "OpenSSL::SSL::SSLContext.new.verify_mode = OpenSSL::SSL::VERIFY_NONE",
|
|
12
12
|
"note": "Intentional: AIA resolver must connect without verification to fetch the leaf certificate from servers with broken certificate chains. This is the core purpose of the module -- it only uses VERIFY_NONE to read the cert, never to transmit data."
|
|
13
|
+
},
|
|
14
|
+
{
|
|
15
|
+
"warning_type": "Mass Assignment",
|
|
16
|
+
"warning_code": 70,
|
|
17
|
+
"fingerprint": "b4702d88859ff3c4f9e954eac9335d003834c1d37797c54587a0dc9b3b76bf8b",
|
|
18
|
+
"check_name": "MassAssignment",
|
|
19
|
+
"message": "Specify exact keys allowed for mass assignment instead of using `permit!` which allows any keys",
|
|
20
|
+
"file": "app/controllers/source_monitor/dashboard_controller.rb",
|
|
21
|
+
"line": 33,
|
|
22
|
+
"note": "Safe: schedule_pages params are only used as page numbers for Turbo Frame pagination of fetch schedule buckets. No model attributes are assigned from these values."
|
|
13
23
|
}
|
|
14
24
|
],
|
|
15
|
-
"updated": "2026-
|
|
25
|
+
"updated": "2026-03-13",
|
|
16
26
|
"brakeman_version": "8.0.2"
|
|
17
27
|
}
|
data/config/routes.rb
CHANGED
|
@@ -7,6 +7,9 @@ SourceMonitor::Engine.routes.draw do
|
|
|
7
7
|
resources :logs, only: :index
|
|
8
8
|
resources :fetch_logs, only: :show
|
|
9
9
|
resources :scrape_logs, only: :show
|
|
10
|
+
resources :import_histories, only: [] do
|
|
11
|
+
resource :dismissal, only: :create, controller: "import_history_dismissals"
|
|
12
|
+
end
|
|
10
13
|
resources :import_sessions, path: "import_opml", only: %i[new create show update destroy] do
|
|
11
14
|
member do
|
|
12
15
|
get "steps/:step", action: :show, as: :step
|
|
@@ -16,6 +19,7 @@ SourceMonitor::Engine.routes.draw do
|
|
|
16
19
|
resources :items, only: %i[index show] do
|
|
17
20
|
post :scrape, on: :member
|
|
18
21
|
end
|
|
22
|
+
resources :bulk_scrape_enablements, only: :create
|
|
19
23
|
resources :sources do
|
|
20
24
|
resource :fetch, only: :create, controller: "source_fetches"
|
|
21
25
|
resource :retry, only: :create, controller: "source_retries"
|
|
@@ -23,5 +27,6 @@ SourceMonitor::Engine.routes.draw do
|
|
|
23
27
|
resource :health_check, only: :create, controller: "source_health_checks"
|
|
24
28
|
resource :health_reset, only: :create, controller: "source_health_resets"
|
|
25
29
|
resource :favicon_fetch, only: :create, controller: "source_favicon_fetches"
|
|
30
|
+
resource :scrape_test, only: :create, controller: "source_scrape_tests"
|
|
26
31
|
end
|
|
27
32
|
end
|
data/db/migrate/20260307120000_add_consecutive_fetch_failures_to_sources.rb
ADDED
|
@@ -0,0 +1,11 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
class AddConsecutiveFetchFailuresToSources < ActiveRecord::Migration[8.0]
|
|
4
|
+
def change
|
|
5
|
+
add_column :sourcemon_sources, :consecutive_fetch_failures, :integer, default: 0, null: false
|
|
6
|
+
|
|
7
|
+
add_index :sourcemon_sources, :consecutive_fetch_failures,
|
|
8
|
+
where: "consecutive_fetch_failures > 0",
|
|
9
|
+
name: "index_sources_on_consecutive_failures"
|
|
10
|
+
end
|
|
11
|
+
end
|
data/db/migrate/20260312120000_simplify_health_status_values.rb
ADDED
|
@@ -0,0 +1,20 @@
|
|
|
1
|
+
class SimplifyHealthStatusValues < ActiveRecord::Migration[8.0]
|
|
2
|
+
def up
|
|
3
|
+
execute <<~SQL
|
|
4
|
+
UPDATE sourcemon_sources SET health_status = 'working' WHERE health_status IN ('healthy', 'auto_paused', 'unknown')
|
|
5
|
+
SQL
|
|
6
|
+
execute <<~SQL
|
|
7
|
+
UPDATE sourcemon_sources SET health_status = 'failing' WHERE health_status IN ('warning', 'critical')
|
|
8
|
+
SQL
|
|
9
|
+
# 'declining' and 'improving' remain unchanged
|
|
10
|
+
end
|
|
11
|
+
|
|
12
|
+
def down
|
|
13
|
+
execute <<~SQL
|
|
14
|
+
UPDATE sourcemon_sources SET health_status = 'healthy' WHERE health_status = 'working'
|
|
15
|
+
SQL
|
|
16
|
+
execute <<~SQL
|
|
17
|
+
UPDATE sourcemon_sources SET health_status = 'critical' WHERE health_status = 'failing'
|
|
18
|
+
SQL
|
|
19
|
+
end
|
|
20
|
+
end
|
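Review note: the `up` and `down` SQL in `SimplifyHealthStatusValues` can be read as two lookup tables. The sketch below replays them in plain Ruby to show what a rollback restores. The constant and method names are hypothetical; the value mappings are copied from the migration.

```ruby
# Mappings taken from the migration's UPDATE statements.
UP_MAP = {
  "healthy" => "working", "auto_paused" => "working", "unknown" => "working",
  "warning" => "failing", "critical" => "failing"
}.freeze
DOWN_MAP = { "working" => "healthy", "failing" => "critical" }.freeze

# Values not listed (declining, improving) pass through unchanged.
def migrate_up(status)
  UP_MAP.fetch(status, status)
end

def migrate_down(status)
  DOWN_MAP.fetch(status, status)
end

%w[healthy warning critical declining improving auto_paused unknown].each do |status|
  puts "#{status} -> #{migrate_up(status)} -> #{migrate_down(migrate_up(status))}"
end
```

The rollback runs without error, but it is lossy: rows that were originally `warning`, `auto_paused`, or `unknown` come back as `critical` or `healthy`.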
data/docs/configuration.md
CHANGED
|
@@ -86,6 +86,14 @@ config.scrapers.register(:custom, "MyApp::Scrapers::Premium" )
|
|
|
86
86
|
|
|
87
87
|
Adapters receive merged settings (`default -> source -> invocation`), and must return a `SourceMonitor::Scrapers::Result` object. Use `config.scrapers.unregister(:custom)` to remove overrides.
|
|
88
88
|
|
|
89
|
+
## Scraping Settings
|
|
90
|
+
|
|
91
|
+
`config.scraping` controls scrape concurrency and recommendations.
|
|
92
|
+
|
|
93
|
+
- `max_in_flight_per_source` – max concurrent scrape jobs per source (`nil` = unlimited, default `nil`)
|
|
94
|
+
- `max_bulk_batch_size` – max items per bulk scrape enqueue (default `100`)
|
|
95
|
+
- `scrape_recommendation_threshold` – minimum average feed word count below which a source is recommended for scraping (default `200`)
|
|
96
|
+
|
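Review note: the effect of `scrape_recommendation_threshold` can be sketched in plain Ruby as follows. This is not the gem's `ScrapeRecommendations` code, which is not shown in this excerpt. The `FeedSource` struct, the method name, and the choice to skip sources that already scrape are assumptions. The strict "less than" comparison follows the `avg_feed_words_lt` filter key seen in the sources index.

```ruby
# Hypothetical stand-in for a source record.
FeedSource = Struct.new(:name, :avg_feed_words, :scraping_enabled)

# Returns sources whose average feed word count falls below the threshold.
# Sources with no word-count data or with scraping already on are skipped
# (an assumption, not confirmed by the diff).
def scrape_candidates(sources, threshold: 200)
  sources.select do |source|
    !source.scraping_enabled &&
      !source.avg_feed_words.nil? &&
      source.avg_feed_words < threshold
  end
end

sources = [
  FeedSource.new("Short teasers", 45.0, false),
  FeedSource.new("Full articles", 850.0, false),
  FeedSource.new("Already scraping", 30.0, true),
  FeedSource.new("No data yet", nil, false)
]
puts scrape_candidates(sources).map(&:name).inspect
```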
|
89
97
|
## Events & Item Processors
|
|
90
98
|
|
|
91
99
|
Respond to lifecycle events without monkey patching:
|
|
@@ -147,7 +155,7 @@ Handlers can be symbols (invoked on the controller) or callables. Return `false`
|
|
|
147
155
|
`config.health` tunes automatic pause/resume heuristics.
|
|
148
156
|
|
|
149
157
|
- `window_size` – number of fetch attempts to evaluate (default `20`)
|
|
150
|
-
- `healthy_threshold`
|
|
158
|
+
- `healthy_threshold` – ratio that drives the "working" badge
|
|
151
159
|
- `auto_pause_threshold` / `auto_resume_threshold` – percentages that trigger automatic toggling
|
|
152
160
|
- `auto_pause_cooldown_minutes` – grace period before re-enabling a source
|
|
153
161
|
|
data/docs/troubleshooting.md
CHANGED
|
@@ -109,6 +109,15 @@ This guide lists common issues you might encounter while installing, upgrading,
|
|
|
109
109
|
- Fix by running `npm install` followed by `npm run build` inside the engine root so that `app/assets/builds/source_monitor/application.css` and `application.js` exist. The Rake task `app:source_monitor:assets:build` wraps the same scripts for CI usage.
|
|
110
110
|
- When the UI is still unstyled, confirm the dummy app can read the namespaced asset directories noted in `.ai/engine-asset-configuration.md:32-44` and restart `bin/dev` so the CSS/JS watchers reconnect.
|
|
111
111
|
|
|
112
|
+
## 14. Feed Returns Cloudflare Challenge Page
|
|
113
|
+
|
|
114
|
+
- **Symptoms:** Fetch logs show HTML containing "Checking your browser" or "Just a moment..." instead of feed XML. The source may display a "Blocked" badge.
|
|
115
|
+
- **How it works:** SourceMonitor automatically detects Cloudflare challenge responses and attempts cookie replay with UA rotation on subsequent fetches. No configuration is needed -- this is enabled by default.
|
|
116
|
+
- **If fetches remain blocked:** Some sites employ aggressive bot mitigation that cannot be bypassed with cookie replay alone. In these cases:
|
|
117
|
+
- Try setting a custom `user_agent` on the source's custom headers to match a specific browser.
|
|
118
|
+
- Consider whether the site offers an alternative feed URL that is not behind Cloudflare.
|
|
119
|
+
- Check the fetch log's `error_category` field -- a value of `blocked` confirms Cloudflare detection triggered.
|
|
120
|
+
|
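Review note: the symptom check described in section 14 can be approximated with a marker search over the response body. This is a sketch only. The real detection lives in `lib/source_monitor/fetching/cloudflare_bypass.rb`, which is not shown in this excerpt, and the names `CHALLENGE_MARKERS` and `cloudflare_challenge?` are hypothetical.

```ruby
# Marker strings quoted in the troubleshooting entry.
CHALLENGE_MARKERS = ["Checking your browser", "Just a moment..."].freeze

# True when the body looks like a Cloudflare interstitial rather than feed XML.
def cloudflare_challenge?(body)
  text = body.to_s
  CHALLENGE_MARKERS.any? { |marker| text.include?(marker) }
end

p cloudflare_challenge?("<html><title>Just a moment...</title></html>")
p cloudflare_challenge?('<?xml version="1.0"?><rss version="2.0"></rss>')
```

A check like this is useful when inspecting a stored fetch log body by hand. A positive result should line up with an `error_category` of `blocked`.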
|
112
121
|
## Still Stuck?
|
|
113
122
|
|
|
114
123
|
Collect the following and open an issue or start a discussion:
|
data/docs/upgrade.md
CHANGED
|
@@ -46,6 +46,37 @@ If a removed option raises an error (`SourceMonitor::DeprecatedOptionError`), yo
|
|
|
46
46
|
|
|
47
47
|
## Version-Specific Notes
|
|
48
48
|
|
|
49
|
+
### Upgrading to 0.11.0
|
|
50
|
+
|
|
51
|
+
**What changed:**
|
|
52
|
+
- Health status values simplified from 7 (`healthy`, `warning`, `critical`, `declining`, `improving`, `auto_paused`, `unknown`) to 4 (`working`, `declining`, `improving`, `failing`).
|
|
53
|
+
- `warning_threshold` configuration setting removed entirely.
|
|
54
|
+
- Auto-pause is now tracked as operational state (via `auto_paused_at`/`auto_paused_until` columns) rather than as a health status value. Sources can be both "failing AND auto-paused".
|
|
55
|
+
- New `determine_status` decision tree uses rate thresholds and streak detection.
|
|
56
|
+
- Dashboard and sources index updated with new badge colors: working (green), declining (yellow), improving (sky), failing (rose).
|
|
57
|
+
- New `consecutive_fetch_failures` column on sources for streak-based health detection.
|
|
58
|
+
- New `error_category` column on fetch logs for classifying failure types (e.g., timeout, DNS, blocked).
|
|
59
|
+
- New `config.scraping.scrape_recommendation_threshold` setting (default 200) controls the word-count threshold for scrape recommendations on the dashboard.
|
|
60
|
+
- Dashboard pagination for sources and items lists.
|
|
61
|
+
- Automatic Cloudflare bypass via cookie replay and UA rotation (no configuration needed).
|
|
62
|
+
- Smart scrape recommendations widget on the dashboard highlights sources that may benefit from scraping.
|
|
63
|
+
|
|
64
|
+
**Upgrade steps:**
|
|
65
|
+
```bash
|
|
66
|
+
bundle update source_monitor
|
|
67
|
+
bin/rails source_monitor:upgrade
|
|
68
|
+
bin/rails db:migrate
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
**Notes:**
|
|
72
|
+
- **Action required:** If your initializer sets `config.health.warning_threshold`, remove that line. The setting no longer exists and will raise an error.
|
|
73
|
+
- The data migration automatically remaps existing health_status values: `healthy`/`auto_paused`/`unknown` become `working`, `warning`/`critical` become `failing`. This is reversible.
|
|
74
|
+
- If your host app queries `health_status` directly (e.g., `Source.where(health_status: "healthy")`), update those queries to use the new values.
|
|
75
|
+
- The `auto_paused?` method and `auto_paused_at`/`auto_paused_until` columns still exist and work the same way. Only the `health_status` column values changed.
|
|
76
|
+
- `healthy_threshold` config setting still exists but now drives the `working` status (renamed from `healthy`).
|
|
77
|
+
- The `consecutive_fetch_failures` and `error_category` columns are added via migrations and require no configuration.
|
|
78
|
+
- Cloudflare bypass is automatic -- sources that return Cloudflare challenge pages will show a "Blocked" badge and the engine retries with cookie replay and UA rotation.
|
|
79
|
+
|
|
49
80
|
### Upgrading to 0.10.0 (from 0.9.x)
|
|
50
81
|
|
|
51
82
|
**What changed:**
|