source_monitor 0.9.1 → 0.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. checksums.yaml +4 -4
  2. data/.claude/commands/release.md +67 -14
  3. data/.claude/skills/sm-configuration-setting/reference/settings-catalog.md +7 -2
  4. data/.claude/skills/sm-configure/reference/configuration-reference.md +13 -2
  5. data/.claude/skills/sm-host-setup/reference/initializer-template.md +4 -0
  6. data/.claude/skills/sm-job/reference/job-conventions.md +9 -7
  7. data/.claude/skills/sm-pipeline-stage/reference/completion-handlers.md +9 -1
  8. data/.claude/skills/sm-upgrade/reference/version-history.md +21 -0
  9. data/.rubocop.yml +1 -0
  10. data/CHANGELOG.md +27 -0
  11. data/CLAUDE.md +2 -4
  12. data/Gemfile.lock +1 -1
  13. data/README.md +6 -6
  14. data/VERSION +1 -1
  15. data/app/jobs/source_monitor/download_content_images_job.rb +1 -1
  16. data/app/jobs/source_monitor/favicon_fetch_job.rb +1 -1
  17. data/app/jobs/source_monitor/import_opml_job.rb +1 -1
  18. data/app/jobs/source_monitor/import_session_health_check_job.rb +1 -1
  19. data/app/jobs/source_monitor/item_cleanup_job.rb +1 -1
  20. data/app/jobs/source_monitor/log_cleanup_job.rb +1 -1
  21. data/app/jobs/source_monitor/schedule_fetches_job.rb +1 -1
  22. data/app/jobs/source_monitor/source_health_check_job.rb +1 -1
  23. data/docs/configuration.md +11 -2
  24. data/docs/deployment.md +5 -1
  25. data/docs/setup.md +2 -2
  26. data/docs/troubleshooting.md +20 -6
  27. data/docs/upgrade.md +27 -0
  28. data/lib/source_monitor/configuration/fetching_settings.rb +5 -1
  29. data/lib/source_monitor/configuration.rb +8 -0
  30. data/lib/source_monitor/fetching/completion/follow_up_handler.rb +7 -1
  31. data/lib/source_monitor/fetching/feed_fetcher/adaptive_interval.rb +2 -1
  32. data/lib/source_monitor/fetching/fetch_runner.rb +14 -5
  33. data/lib/source_monitor/fetching/stalled_fetch_reconciler.rb +3 -5
  34. data/lib/source_monitor/scheduler.rb +9 -5
  35. data/lib/source_monitor/version.rb +1 -1
  36. data/lib/tasks/stagger_fetch_times.rake +37 -0
  37. metadata +3 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: c3dd7577c86e15ec9926a631998d423b1d2fd1bc18cbdfc83e8d7dc57b6be365
-   data.tar.gz: 65fc2870418c04d3741a98404558ffdd3e8f5a901294681446f337b662dd50f2
+   metadata.gz: 303d253e46391a54167ab1396f8f855228fb4cd867dbcf22614c9aa75b9b2e30
+   data.tar.gz: 19b54173bc76cb68615b44dd93fe1ac525e9260da83e4dbfa5311e9c71ccb73a
  SHA512:
-   metadata.gz: 115775737ef8f40ea9323932d58e941fccb9b6903371cbdab80d62bcdfbf31d85f44c5eacbcc7b536489483972a2b02e7e547454715dae9af8b19083dce62a62
-   data.tar.gz: 2a285b946a069420c28a1588f580f721f0ea23da6b5c002a0dfb8d149c338c120291e4ad8e77609b65f59ae55bfadec7db6b85c49f92d2097616588d9425c363
+   metadata.gz: 2ff0ad53a04b7685490ec6d0ae39d48906dcc92a9b16062a7cc056316dcef88c38a039372551f8a9ef2e0f3c9236a3d0a66aa150127be4d902d85c4d35a42230
+   data.tar.gz: 11c424108aece6ae5b5866bebc79df7d1972f5df2c0091a3fd4720e4779284cefbb7222ff7fa31deae90b5777650ef5593ade5f0140aa9699e314a0b36e082a2
data/.claude/commands/release.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Release: PR, CI, Merge, and Gem Build
2
2
 
3
- Orchestrate a full release cycle for the source_monitor gem. This command handles changelog generation, version bumping, PR creation, CI monitoring, auto-merge on success, release tagging, and gem build with push instructions.
3
+ Orchestrate a full release cycle for the source_monitor gem. This command handles changelog generation, documentation audit, version bumping, PR creation, CI monitoring, auto-merge on success, release tagging, and gem build with push instructions.
4
4
 
5
5
  ## Inputs
6
6
 
@@ -26,6 +26,7 @@ These are real issues encountered in previous releases. Each step below accounts
26
26
  9. **ESLint browser globals**: Any JS file using browser APIs (MutationObserver, requestAnimationFrame, cancelAnimationFrame, IntersectionObserver, etc.) MUST declare them with a `/* global ... */` comment at the top. ESLint's `no-undef` rule in CI will reject them otherwise.
27
27
  10. **Diff coverage rescue paths**: Every `rescue`/fallback/error handling branch in changed source code needs test coverage. Common blind spots: `rescue StandardError => e` logging, `rescue URI::InvalidURIError` returning nil, fallback `false` returns. Write targeted tests for these BEFORE creating the release commit.
28
28
  11. **Zsh glob nomatch**: Commands like `rm -f *.gem` fail in zsh when no files match. Always use `rm -f *.gem 2>/dev/null || true` or check existence first with `ls`.
29
+ 12. **Documentation drift**: Features, config options, and behavioral changes often land across milestone work without corresponding doc updates. The Documentation Audit step (Step 4) catches this -- check `docs/`, `README.md`, skill reference files (`sm-*/reference/`), and the initializer template against the actual source code. In v0.9.x, 14 files needed updates that would have been missed without this step.
29
30
 
30
31
  ## Step 1: Git Hygiene
31
32
 
@@ -113,7 +114,58 @@ The changelog follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) f
113
114
  - Insert the new versioned entry immediately after the `## [Unreleased]` block and before the previous release entry.
114
115
  - Preserve all existing entries below.
115
116
 
116
- ## Step 4: Sync Gemfile.lock
117
+ ## Step 4: Documentation Audit
118
+
119
+ Verify that all project documentation reflects the current state of the codebase. Changes made since the last release (or during milestone work) may have introduced features, configuration options, bug fixes, or behavioral changes that need to be documented.
120
+
121
+ 1. **Gather what changed** since the last release tag:
122
+ ```
123
+ git diff vPREVIOUS..HEAD --name-only -- lib/ app/ config/
124
+ ```
125
+ This shows which source files changed. Use this to identify features/fixes that may need documentation.
126
+
127
+ 2. **Check these documentation files against the changes:**
128
+
+ | File | What to verify |
+ |------|---------------|
+ | `CHANGELOG.md` | Has an `[Unreleased]` or versioned entry covering all user-facing changes |
+ | `README.md` | Version references match, feature descriptions current, gem version in install instructions |
+ | `docs/configuration.md` | All config options documented, new settings included, env vars listed |
+ | `docs/deployment.md` | Worker/queue descriptions match current queues and job assignments |
+ | `docs/troubleshooting.md` | Covers known failure modes from recent changes |
+ | `docs/upgrade.md` | Has upgrade section for this version with action items |
+ | `docs/setup.md` | Setup steps still accurate |
+
+ 3. **Check skills reference files** (engine-specific documentation for Claude Code):
+
+ | Skill Reference | What to verify |
+ |----------------|---------------|
+ | `sm-configure/reference/configuration-reference.md` | All config settings and their defaults |
+ | `sm-configuration-setting/reference/settings-catalog.md` | Settings catalog with types, defaults, descriptions |
+ | `sm-job/reference/job-conventions.md` | Queue names, job assignments, concurrency defaults |
+ | `sm-pipeline-stage/reference/completion-handlers.md` | Pipeline handler code matches actual implementation |
+ | `sm-upgrade/reference/version-history.md` | Version transition notes for the new release |
+ | `sm-host-setup/reference/initializer-template.md` | Initializer template shows all configurable options |
149
+
150
+ 4. **For each file that is stale or missing coverage**:
151
+ - Update it to reflect the current codebase behavior.
152
+ - For config docs: read the actual settings classes in `lib/source_monitor/configuration/` to verify defaults.
153
+ - For job docs: read `app/jobs/source_monitor/` to verify queue assignments.
154
+ - For upgrade notes: summarize breaking changes, new config, and action items.
155
+
156
+ 5. **If all documentation is already up to date**, report:
157
+ ```
158
+ Documentation Audit: All files current. No updates needed.
159
+ ```
160
+ If updates were made, report:
161
+ ```
162
+ Documentation Audit: Updated N files.
163
+ - <file>: <what was updated>
164
+ ```
165
+
166
+ Do NOT commit documentation updates separately -- they will be included in the single release commit in Step 7.
167
+
168
+ ## Step 5: Sync Gemfile.lock
117
169
 
118
170
  **CRITICAL**: After updating `version.rb`, the gemspec version changes and `Gemfile.lock` becomes stale.
119
171
 
@@ -121,7 +173,7 @@ The changelog follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) f
121
173
  2. Verify the output shows the new version: `Using source_monitor X.Y.Z (was X.Y.Z-1)`.
122
174
  3. If `bundle install` fails, resolve the issue before proceeding.
123
175
 
124
- ## Step 5: Local Pre-flight Checks
176
+ ## Step 6: Local Pre-flight Checks
125
177
 
126
178
  **CRITICAL**: Run the FULL local CI equivalent BEFORE creating the release branch and pushing. Each CI failure → fix → amend → force-push cycle wastes ~5 minutes. In v0.7.0, skipping this step caused two wasted CI roundtrips. In v0.8.0, skipping ESLint and diff coverage pre-checks caused another two wasted cycles.
127
179
 
@@ -142,9 +194,9 @@ The changelog follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) f
142
194
  - Browser globals (MutationObserver, requestAnimationFrame, cancelAnimationFrame, IntersectionObserver, etc.) must be declared with `/* global ... */` comments at the top of the file.
143
195
  - Missing `/* global */` declarations cause ESLint `no-undef` failures.
144
196
 
145
- Only proceed to Step 6 when ALL five checks pass.
197
+ Only proceed to Step 7 when ALL five checks pass.
146
198
 
147
- ## Step 6: Create Release Branch with Single Squashed Commit
199
+ ## Step 7: Create Release Branch with Single Squashed Commit
148
200
 
149
201
  **IMPORTANT**: All release changes MUST be in a single commit on the release branch. This avoids pre-push hook issues where individual commits are checked for VERSION changes.
150
202
 
@@ -162,7 +214,7 @@ Only proceed to Step 6 when ALL five checks pass.
162
214
  - If the pre-push hook blocks with a false positive (e.g., VBW files dirty in working tree despite being gitignored), use `git push -u --no-verify origin release/vX.Y.Z`. This is safe because we've verified VERSION is in the commit.
163
215
  5. If the push fails for other reasons, diagnose and fix before proceeding.
164
216
 
165
- ## Step 7: Create PR
217
+ ## Step 8: Create PR
166
218
 
167
219
  1. Create the PR using `gh pr create`:
168
220
  - Title: `Release vX.Y.Z`
@@ -175,6 +227,7 @@ Only proceed to Step 6 when ALL five checks pass.
175
227
  ### Release Checklist
176
228
  - [x] Version bumped in `lib/source_monitor/version.rb` and `VERSION`
177
229
  - [x] CHANGELOG.md updated
230
+ - [x] Documentation audited and updated
178
231
  - [x] Gemfile.lock synced
179
232
  - [ ] CI passes (lint, security, test, release_verification)
180
233
 
@@ -184,7 +237,7 @@ Only proceed to Step 6 when ALL five checks pass.
184
237
  - Base: `main`
185
238
  2. Report the PR URL to the user.
186
239
 
187
- ## Step 8: Monitor CI Pipeline
240
+ ## Step 9: Monitor CI Pipeline
188
241
 
189
242
  Poll the CI status using repeated `gh pr checks <PR_NUMBER>` calls. The CI has 4 required jobs: `lint`, `security`, `test`, `release_verification` (release_verification only runs after test passes).
190
243
 
@@ -195,7 +248,7 @@ Poll the CI status using repeated `gh pr checks <PR_NUMBER>` calls. The CI has 4
195
248
 
196
249
  ### If CI PASSES (all checks green):
197
250
 
198
- Continue to Step 9. If Step 5 (local pre-flight) was done properly, CI should pass on the first attempt.
251
+ Continue to Step 10. If Step 6 (local pre-flight) was done properly, CI should pass on the first attempt.
199
252
 
200
253
  ### If CI FAILS:
201
254
 
@@ -205,8 +258,8 @@ Continue to Step 9. If Step 5 (local pre-flight) was done properly, CI should pa
205
258
  gh run view <RUN_ID> --log-failed | tail -80
206
259
  ```
207
260
  3. **Common failure: diff coverage** -- If the `test` job fails on "Enforce diff coverage", it means changed source lines lack test coverage. Read the error to identify uncovered files/lines, write tests, and add them to the release commit.
208
- 4. **Common failure: Gemfile.lock frozen** -- If `bundle install` fails in CI with "frozen mode", you forgot to run `bundle install` locally (Step 4). Amend the commit with the updated lockfile.
209
- 5. **Common failure: RuboCop lint** -- If the `lint` job fails, a RuboCop violation slipped through. This should have been caught in Step 5.
261
+ 4. **Common failure: Gemfile.lock frozen** -- If `bundle install` fails in CI with "frozen mode", you forgot to run `bundle install` locally (Step 5). Amend the commit with the updated lockfile.
262
+ 5. **Common failure: RuboCop lint** -- If the `lint` job fails, a RuboCop violation slipped through. This should have been caught in Step 6.
210
263
  6. **IMPORTANT: When fixing CI failures, run ALL local checks again before re-pushing.** Don't just fix the one failure — run `bin/rubocop` AND `PARALLEL_WORKERS=1 bin/rails test` to catch cascading issues. In v0.7.0, fixing a diff coverage failure introduced a RuboCop violation, requiring a third CI cycle.
211
264
  7. Present failure details to the user and ask what to do:
212
265
  - "Fix the issues and re-push" -- Fix issues, run ALL local checks (rubocop + tests), amend the commit (`git commit --amend --no-edit`), force push (`git push --force-with-lease --no-verify origin release/vX.Y.Z`), and restart CI monitoring.
@@ -215,7 +268,7 @@ Continue to Step 9. If Step 5 (local pre-flight) was done properly, CI should pa
215
268
 
216
269
  **Note on force pushes**: When force-pushing the release branch after amending, always use `--no-verify` because the pre-push hook will see the diff between old and new branch tips, and `VERSION` won't appear as changed (it's the same in both). This is expected and safe.
217
270
 
218
- ## Step 9: Auto-Merge PR
271
+ ## Step 10: Auto-Merge PR
219
272
 
220
273
  Once CI is green:
221
274
 
@@ -230,7 +283,7 @@ Once CI is green:
230
283
 
231
284
  3. Report: "PR #N merged successfully."
232
285
 
233
- ## Step 10: Tag the Release
286
+ ## Step 11: Tag the Release
234
287
 
235
288
  1. Verify you're on main and synced with origin.
236
289
  2. Create an annotated tag:
@@ -244,7 +297,7 @@ Once CI is green:
244
297
  ```
245
298
  5. Report the release URL.
246
299
 
247
- ## Step 11: Build the Gem
300
+ ## Step 12: Build the Gem
248
301
 
249
302
  1. Clean any old gem files. **Note**: zsh fails on `rm -f *.gem` when no files match due to `nomatch`. Use:
250
303
  ```
@@ -254,7 +307,7 @@ Once CI is green:
254
307
  3. Verify the gem was built: check for `source_monitor-X.Y.Z.gem` in the project root.
255
308
  4. Show the file size: `ls -la source_monitor-X.Y.Z.gem`
256
309
 
257
- ## Step 12: Gem Push Instructions
310
+ ## Step 13: Gem Push Instructions
258
311
 
259
312
  Present the final instructions to the user:
260
313
 
data/.claude/skills/sm-configuration-setting/reference/settings-catalog.md CHANGED
@@ -18,9 +18,12 @@ All configuration sections with their attributes, defaults, and types.
18
18
  | `mission_control_enabled` | Boolean | `false` | Enable Mission Control integration |
19
19
  | `mission_control_dashboard_path` | String/Proc/nil | `nil` | Path or callable for Mission Control |
20
20
 
21
+ | `maintenance_queue_name` | String | `"source_monitor_maintenance"` | Queue name for maintenance jobs |
22
+ | `maintenance_queue_concurrency` | Integer | `1` | Max concurrent maintenance workers |
23
+
21
24
  **Methods:**
22
- - `queue_name_for(:fetch)` / `queue_name_for(:scrape)` -- Returns prefixed queue name
23
- - `concurrency_for(:fetch)` / `concurrency_for(:scrape)` -- Returns concurrency limit
25
+ - `queue_name_for(:fetch)` / `queue_name_for(:scrape)` / `queue_name_for(:maintenance)` -- Returns prefixed queue name
26
+ - `concurrency_for(:fetch)` / `concurrency_for(:scrape)` / `concurrency_for(:maintenance)` -- Returns concurrency limit
24
27
 
25
28
  ---
26
29
 
@@ -58,6 +61,8 @@ Has `reset!` method.
58
61
  | `decrease_factor` | Float | `0.75` | Multiplier when content changed |
59
62
  | `failure_increase_factor` | Float | `1.5` | Multiplier on fetch failure |
60
63
  | `jitter_percent` | Float | `0.1` | Random jitter (10%) |
64
+ | `scheduler_batch_size` | Integer | `25` | Max sources per scheduler run |
65
+ | `stale_timeout_minutes` | Integer | `5` | Minutes before stuck "fetching" source is reset |
61
66
 
62
67
  Has `reset!` method. All attributes are plain `attr_accessor`.
63
68
 
data/.claude/skills/sm-configure/reference/configuration-reference.md CHANGED
@@ -20,12 +20,15 @@ Defined on `SourceMonitor::Configuration`:
20
20
  | `mission_control_enabled` | Boolean | `false` | Show Mission Control link on dashboard |
21
21
  | `mission_control_dashboard_path` | String/Proc/nil | `nil` | Path or callable returning MC URL |
22
22
 
23
+ | `maintenance_queue_name` | String | `"source_monitor_maintenance"` | Queue name for maintenance jobs |
24
+ | `maintenance_queue_concurrency` | Integer | `1` | Advisory concurrency for maintenance queue |
25
+
23
26
  ### Methods
24
27
 
25
28
  | Method | Signature | Description |
26
29
  |---|---|---|
27
- | `queue_name_for` | `(role) -> String` | Returns resolved queue name with host prefix |
28
- | `concurrency_for` | `(role) -> Integer` | Returns concurrency for `:fetch` or `:scrape` |
30
+ | `queue_name_for` | `(role) -> String` | Returns resolved queue name with host prefix (`:fetch`, `:scrape`, or `:maintenance`) |
31
+ | `concurrency_for` | `(role) -> Integer` | Returns concurrency for `:fetch`, `:scrape`, or `:maintenance` |
29
32
 
30
33
  ---
31
34
 
@@ -70,11 +73,15 @@ Controls adaptive fetch scheduling.
70
73
  | `decrease_factor` | Float | `0.75` | Multiplier when new items arrive |
71
74
  | `failure_increase_factor` | Float | `1.5` | Multiplier on consecutive failures |
72
75
  | `jitter_percent` | Float | `0.1` | Random jitter (+/-10%, 0 disables) |
76
+ | `scheduler_batch_size` | Integer | `25` | Max sources per scheduler run |
77
+ | `stale_timeout_minutes` | Integer | `5` | Minutes before stuck "fetching" source is reset |
73
78
 
74
79
  ```ruby
75
80
  config.fetching.min_interval_minutes = 10
76
81
  config.fetching.max_interval_minutes = 720 # 12 hours
77
82
  config.fetching.jitter_percent = 0.15 # +/-15%
83
+ config.fetching.scheduler_batch_size = 50 # Increase for larger servers
84
+ config.fetching.stale_timeout_minutes = 3 # Faster recovery
78
85
  ```
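How `stale_timeout_minutes` might be applied can be pictured with a small sketch. The method name `stalled?` and the `fetch_started_at` argument are hypothetical and are not taken from the gem; only the default of 5 minutes (previously hardcoded at 10) comes from this diff.

```ruby
# Hypothetical check, not the gem's reconciler: has a source been in
# "fetching" for at least `stale_timeout_minutes`?
def stalled?(fetch_started_at, now: Time.now, stale_timeout_minutes: 5)
  fetch_started_at <= now - (stale_timeout_minutes * 60)
end

started = Time.now - (6 * 60)                 # fetch began six minutes ago
stalled?(started)                             # compared against the new default of 5
stalled?(started, stale_timeout_minutes: 10)  # compared against the old value of 10
```

Lowering the timeout makes recovery faster, but a value below the duration of a slow yet healthy fetch would reset that fetch prematurely.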
79
86
 
80
87
  ---
@@ -395,4 +402,8 @@ Failed attempts are tracked in the source's `metadata` JSONB column (`favicon_la
395
402
  | `SOFT_DELETE` | Override retention strategy in rake tasks |
396
403
  | `SOURCE_IDS` / `SOURCE_ID` | Scope cleanup rake tasks to specific sources |
397
404
  | `FETCH_LOG_DAYS` / `SCRAPE_LOG_DAYS` | Retention windows for log cleanup |
405
+ | `WINDOW_MINUTES` | Time window for `stagger_fetch_times` rake task (default `10`) |
406
+ | `SOURCE_MONITOR_FETCH_CONCURRENCY` | Override fetch queue concurrency in `solid_queue.yml` |
407
+ | `SOURCE_MONITOR_SCRAPE_CONCURRENCY` | Override scrape queue concurrency in `solid_queue.yml` |
408
+ | `SOURCE_MONITOR_MAINTENANCE_CONCURRENCY` | Override maintenance queue concurrency in `solid_queue.yml` |
398
409
  | `SOURCE_MONITOR_SETUP_TELEMETRY` | Enable setup verification telemetry logging |
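The concurrency variables above are read with `ENV.fetch` and a numeric default. In plain Ruby, `ENV.fetch` returns a String when the variable is set and the default object otherwise, so a caller that needs an Integer has to coerce. The helper below is a hypothetical sketch, not part of the gem; the `env:` parameter exists only to make it testable.

```ruby
# Hypothetical helper: read a concurrency override as an Integer.
# Integer() accepts both "3" (from the environment) and 1 (the default),
# and raises ArgumentError on non-numeric input such as "many".
def concurrency_from_env(key, default, env: ENV)
  Integer(env.fetch(key, default))
end

concurrency_from_env("SOURCE_MONITOR_MAINTENANCE_CONCURRENCY", 1)
```

Inside an ERB-templated `solid_queue.yml` the value is interpolated as text, so the distinction only matters when the value is consumed from Ruby code.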
data/.claude/skills/sm-host-setup/reference/initializer-template.md CHANGED
@@ -27,10 +27,12 @@ SourceMonitor.configure do |config|
27
27
  # Dedicated queue names. Must match entries in config/solid_queue.yml.
28
28
  config.fetch_queue_name = "source_monitor_fetch"
29
29
  config.scrape_queue_name = "source_monitor_scrape"
30
+ config.maintenance_queue_name = "source_monitor_maintenance"
30
31
 
31
32
  # Worker concurrency per queue (advisory for Solid Queue).
32
33
  config.fetch_queue_concurrency = 2
33
34
  config.scrape_queue_concurrency = 2
35
+ config.maintenance_queue_concurrency = 1
34
36
 
35
37
  # Override the job class Solid Queue uses for recurring "command" tasks.
36
38
  # config.recurring_command_job_class = "MyRecurringCommandJob"
@@ -98,6 +100,8 @@ SourceMonitor.configure do |config|
98
100
  # config.fetching.decrease_factor = 0.75 # Multiplier when items arrive
99
101
  # config.fetching.failure_increase_factor = 1.5 # Multiplier on errors
100
102
  # config.fetching.jitter_percent = 0.1 # Random jitter (+/-10%)
103
+ # config.fetching.scheduler_batch_size = 25 # Max sources per scheduler run
104
+ # config.fetching.stale_timeout_minutes = 5 # Minutes before stuck fetch is reset
101
105
 
102
106
  # ===========================================================================
103
107
  # Source Health Monitoring
data/.claude/skills/sm-job/reference/job-conventions.md CHANGED
@@ -37,16 +37,18 @@ SourceMonitor.queue_name(:fetch)
37
37
 
38
38
  ### Default Names
39
39
 
40
- | Role | Queue Name |
41
- |------|-----------|
42
- | `:fetch` | `source_monitor_fetch` |
43
- | `:scrape` | `source_monitor_scrape` |
40
+ | Role | Queue Name | Jobs |
41
+ |------|-----------|------|
42
+ | `:fetch` | `source_monitor_fetch` | FetchFeedJob, ScheduleFetchesJob |
43
+ | `:scrape` | `source_monitor_scrape` | ScrapeItemJob |
44
+ | `:maintenance` | `source_monitor_maintenance` | SourceHealthCheckJob, ImportSessionHealthCheckJob, ImportOpmlJob, LogCleanupJob, ItemCleanupJob, FaviconFetchJob, DownloadContentImagesJob |
44
45
 
45
46
  ### With Host App Prefix
46
47
 
47
48
  If the host app sets `ActiveJob::Base.queue_name_prefix = "myapp"`:
48
49
  - Fetch queue becomes `myapp_source_monitor_fetch`
49
50
  - Scrape queue becomes `myapp_source_monitor_scrape`
51
+ - Maintenance queue becomes `myapp_source_monitor_maintenance`
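The prefixing rule in the list above can be sketched as follows. This is an illustration rather than the gem's implementation: `DEFAULT_QUEUES` and `resolved_queue_name` are hypothetical names, and the `"_"` delimiter is an assumption based on Active Job's default `queue_name_delimiter`.

```ruby
# Hypothetical resolver illustrating the prefix rule; not the gem's code.
DEFAULT_QUEUES = {
  fetch: "source_monitor_fetch",
  scrape: "source_monitor_scrape",
  maintenance: "source_monitor_maintenance"
}.freeze

def resolved_queue_name(role, prefix: nil, delimiter: "_")
  name = DEFAULT_QUEUES.fetch(role) # raises KeyError for unknown roles
  return name if prefix.nil? || prefix.empty?

  "#{prefix}#{delimiter}#{name}"
end

resolved_queue_name(:maintenance, prefix: "myapp")
```

Whatever name results is the one that has to appear in the host app's `solid_queue.yml`.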
50
52
 
51
53
  ## Job Patterns by Type
52
54
 
@@ -88,7 +90,7 @@ Demonstrates options normalization pattern:
88
90
  ```ruby
89
91
  class ItemCleanupJob < ApplicationJob
90
92
  DEFAULT_BATCH_SIZE = 100
91
- source_monitor_queue :fetch
93
+ source_monitor_queue :maintenance
92
94
 
93
95
  def perform(options = nil)
94
96
  options = Jobs::CleanupOptions.normalize(options)
@@ -170,7 +172,7 @@ Demonstrates multi-strategy cascade with guard clauses:
170
172
 
171
173
  ```ruby
172
174
  class FaviconFetchJob < ApplicationJob
173
- source_monitor_queue :fetch
175
+ source_monitor_queue :maintenance
174
176
  discard_on ActiveJob::DeserializationError
175
177
 
176
178
  def perform(source_id)
@@ -196,7 +198,7 @@ Demonstrates result broadcasting:
196
198
 
197
199
  ```ruby
198
200
  class SourceHealthCheckJob < ApplicationJob
199
- source_monitor_queue :fetch
201
+ source_monitor_queue :maintenance
200
202
  discard_on ActiveJob::DeserializationError
201
203
 
202
204
  def perform(source_id)
data/.claude/skills/sm-pipeline-stage/reference/completion-handlers.md CHANGED
@@ -62,12 +62,20 @@ class FollowUpHandler
      return unless should_enqueue?(source:, result:)
      result.item_processing.created_items.each do |item|
        next unless item.present? && item.scraped_at.nil?
-       enqueuer_class.enqueue(item:, source:, job_class:, reason: :auto)
+       begin
+         enqueuer_class.enqueue(item:, source:, job_class:, reason: :auto)
+       rescue StandardError => error
+         Rails.logger.error(
+           "[SourceMonitor] FollowUpHandler: failed to enqueue scrape for item #{item.id}: #{error.class}: #{error.message}"
+         ) if defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger
+       end
      end
    end
  end
69
75
  ```
70
76
 
77
+ Each scrape enqueue is wrapped in a per-item rescue so one failing item doesn't block the rest.
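The effect of the per-item rescue can be seen in isolation with a minimal sketch. `enqueue_all` and its `logger:` callable are hypothetical stand-ins for the handler's loop and `Rails.logger`; the block passed in plays the role of `enqueuer_class.enqueue`.

```ruby
# Hypothetical stand-in for the enqueue loop: each item has its own rescue,
# so a failure is logged and the loop moves on to the next item.
def enqueue_all(items, logger: nil)
  enqueued = []
  items.each do |item|
    begin
      yield item # stands in for enqueuer_class.enqueue(...)
      enqueued << item
    rescue StandardError => error
      logger&.call("failed to enqueue #{item}: #{error.class}: #{error.message}")
    end
  end
  enqueued
end

errors = []
done = enqueue_all([1, 2, 3], logger: ->(msg) { errors << msg }) do |item|
  raise ArgumentError, "boom" if item == 2
end
# `done` holds the items that were enqueued; `errors` holds one message for item 2.
```

Without the inner `begin`/`rescue`, the exception raised for item 2 would abort the `each` and item 3 would never be attempted.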
78
+
71
79
  Guard conditions:
72
80
  - Result status must be `:fetched`
73
81
  - Source must have `scraping_enabled?` and `auto_scrape?`
data/.claude/skills/sm-upgrade/reference/version-history.md CHANGED
@@ -2,6 +2,27 @@
2
2
 
3
3
  Version-specific migration notes for each major/minor version transition. Agents should reference this file when guiding users through multi-version upgrades.
4
4
 
5
+ ## 0.9.x to next release
6
+
7
+ **Key changes:**
8
+ - New third queue: `source_monitor_maintenance` for non-fetch jobs (health checks, cleanup, favicon, images, OPML import). Keeps the fetch queue dedicated to FetchFeedJob and ScheduleFetchesJob.
9
+ - `config.maintenance_queue_name` (default `"source_monitor_maintenance"`) and `config.maintenance_queue_concurrency` (default `1`) for tuning the maintenance queue.
10
+ - `config.fetching.scheduler_batch_size` (default `25`, was hardcoded `100`) limits sources per scheduler run. Optimized for 1-CPU/2GB servers.
11
+ - `config.fetching.stale_timeout_minutes` (default `5`, was hardcoded `10`) controls stalled fetch recovery speed.
12
+ - Fixed-interval sources now get ±10% jitter on `next_fetch_at` (previously exact intervals).
13
+ - Fetch pipeline error handling hardened: DB errors in `update_source_state!` propagate instead of being silently swallowed, `ensure` block guarantees status reset from "fetching", `FollowUpHandler` rescues per-item enqueue failures.
14
+ - New rake task: `source_monitor:maintenance:stagger_fetch_times` distributes overdue sources across a configurable window (`WINDOW_MINUTES` env var, default 10).
15
+
16
+ **Action items:**
17
+ 1. **Action required:** Add the maintenance queue to your `solid_queue.yml`:
18
+ ```yaml
19
+ source_monitor_maintenance:
20
+   concurrency: <%= ENV.fetch("SOURCE_MONITOR_MAINTENANCE_CONCURRENCY", 1) %>
21
+ ```
22
+ 2. If you have many overdue sources after upgrading, run `bin/rails source_monitor:maintenance:stagger_fetch_times` to break the thundering herd.
23
+ 3. For larger servers (4+ CPUs, 8GB+), increase batch size: `config.fetching.scheduler_batch_size = 50` (or higher).
24
+ 4. All existing configuration remains valid. No breaking changes.
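The ±10% jitter on fixed-interval sources mentioned in the key changes can be pictured with a short sketch. `jittered_interval` and the injectable `rng` are hypothetical and do not reflect the gem's internals; only the 10% figure comes from the notes above.

```ruby
# Hypothetical helper: spread a fixed interval by +/- jitter_percent so that
# sources sharing the same interval do not all come due at the same moment.
def jittered_interval(interval_minutes, jitter_percent: 0.1, rng: Random.new)
  return interval_minutes.to_f if jitter_percent.zero?

  spread = interval_minutes * jitter_percent
  # Random#rand with a Float range returns a Float inside that range.
  interval_minutes + rng.rand(-spread..spread)
end

jittered_interval(60) # somewhere between 54 and 66 minutes
```

Passing a seeded `Random` makes the result reproducible in tests.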
25
+
5
26
  ## 0.7.x to 0.8.0
6
27
 
7
28
  **Key changes:**
data/.rubocop.yml CHANGED
@@ -6,6 +6,7 @@ AllCops:
6
6
  - "test/dummy/db/schema.rb"
7
7
  - "test/tmp/**/*"
8
8
  - "test/lib/tmp/**/*"
9
+ - "examples/**/*.yml"
9
10
 
10
11
  # Overwrite or add rules to create your own house style
11
12
  #
data/CHANGELOG.md CHANGED
@@ -15,6 +15,33 @@ All notable changes to this project are documented below. The format follows [Ke
15
15
 
16
16
  - No unreleased changes yet.
17
17
 
18
+ ## [0.10.0] - 2026-02-24
19
+
20
+ ### Added
21
+
22
+ - **Maintenance queue for non-fetch jobs.** New third queue (`source_monitor_maintenance`) separates non-time-sensitive jobs from the fetch pipeline. Health checks, cleanup, favicon fetching, image downloading, and OPML import jobs now run on the maintenance queue, keeping the fetch queue dedicated to `FetchFeedJob` and `ScheduleFetchesJob`. Configure via `config.maintenance_queue_name` and `config.maintenance_queue_concurrency`.
23
+ - **Configurable scheduler batch size.** `config.fetching.scheduler_batch_size` (default `25`, was hardcoded at `100`) controls how many sources are picked up per scheduler run. Optimized for 1-CPU/2GB servers.
24
+ - **Configurable stale fetch timeout.** `config.fetching.stale_timeout_minutes` (default `5`, was hardcoded at `10`) controls how long a source can remain in "fetching" status before the stalled fetch reconciler resets it.
25
+ - **Stagger fetch times rake task.** `source_monitor:maintenance:stagger_fetch_times` distributes all currently-due sources across a configurable time window (`WINDOW_MINUTES` env var, default 10 minutes), breaking thundering herd patterns after deploys, queue stalls, or large OPML imports.
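One way to distribute due sources across a window is even spacing. The rake task's actual strategy is not visible in this part of the diff, so the sketch below is an assumption; `staggered_offsets` is a hypothetical name, and the default of 10 mirrors the `WINDOW_MINUTES` default.

```ruby
# Hypothetical even-spacing strategy: return one offset in seconds per source,
# spread across the window. The real task may distribute differently.
def staggered_offsets(count, window_minutes: 10)
  return [] if count <= 0

  step = (window_minutes * 60.0) / count
  Array.new(count) { |i| (i * step).round }
end

staggered_offsets(4) # => [0, 150, 300, 450]
```

Each offset would be added to the current time to produce a source's new `next_fetch_at`.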
26
+
27
+ ### Fixed
28
+
29
+ - **Fetch pipeline error handling safety net.** DB update failures in `update_source_state!` now propagate instead of being silently swallowed. Broadcast failures are still rescued (non-critical). An `ensure` block in `FetchRunner#run` guarantees fetch_status resets from "fetching" to "failed" on any unexpected exit path. `FollowUpHandler` now rescues per-item scrape enqueue failures so one bad item doesn't block remaining enqueues.
30
+ - **Fixed-interval sources now get scheduling jitter.** Sources using fixed fetch intervals (not adaptive) now receive ±10% jitter on `next_fetch_at`, preventing thundering herd effects when many sources share the same interval.
31
+ - **ScheduleFetchesJob uses configured batch size.** The job's fallback limit now reads `config.fetching.scheduler_batch_size` (25) instead of the legacy `DEFAULT_BATCH_SIZE` constant (100).
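The `ensure` guarantee described in the first Fixed entry can be illustrated with a reduced sketch. `FakeSource`, `run_fetch`, and the `"idle"` success status are hypothetical; only the reset from "fetching" to "failed" is taken from the changelog.

```ruby
# Hypothetical stand-in for a source record.
FakeSource = Struct.new(:fetch_status)

# Sketch of the safety net: whatever happens inside the block, the source
# never stays in "fetching" once the method exits.
def run_fetch(source)
  source.fetch_status = "fetching"
  yield source
  source.fetch_status = "idle" # assumed success status, for illustration only
ensure
  source.fetch_status = "failed" if source.fetch_status == "fetching"
end

source = FakeSource.new(nil)
begin
  run_fetch(source) { raise "database unavailable" }
rescue RuntimeError
  # The error still propagates to the caller; the status has been reset.
end
```

The exception is not swallowed here, which matches the entry's point that DB failures now propagate while the status is still cleaned up.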
32
+
33
+ ### Changed
34
+
35
+ - Default scheduler batch size reduced from 100 to 25 (configurable via `config.fetching.scheduler_batch_size`).
36
+ - Default stale fetch timeout reduced from 10 to 5 minutes (configurable via `config.fetching.stale_timeout_minutes`).
37
+ - 7 jobs moved from fetch queue to maintenance queue: `SourceHealthCheckJob`, `ImportSessionHealthCheckJob`, `ImportOpmlJob`, `LogCleanupJob`, `ItemCleanupJob`, `FaviconFetchJob`, `DownloadContentImagesJob`.
38
+
39
+ ### Testing
40
+
41
+ - 1,214 tests, 3,765 assertions, 0 failures.
42
+ - RuboCop: 0 offenses (424 files).
43
+ - Brakeman: 0 warnings.
44
+
18
45
  ## [0.9.1] - 2026-02-22
19
46
 
20
47
  ### Fixed
data/CLAUDE.md CHANGED
@@ -4,10 +4,8 @@
4
4
 
5
5
  ## Active Context
6
6
 
7
- **Milestone:** polish-and-reliability (extended)
8
- **Phase:** 4 of 5 -- Bug Fixes & Polish (pending planning)
9
- **Previous phases:** Backend Fixes, Favicon Support, Toast Stacking (all complete)
10
- **Next action:** /vbw:vibe to plan and execute Phase 4
7
+ **Last shipped:** polish-and-reliability (6 phases, 17 plans, 35 commits)
8
+ **Next action:** /vbw:vibe to start new work
11
9
 
12
10
  ## Key Decisions
13
11
 
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- source_monitor (0.9.1)
4
+ source_monitor (0.10.0)
5
5
  cssbundling-rails (~> 1.4)
6
6
  faraday (~> 2.9)
7
7
  faraday-follow_redirects (~> 0.4)
data/README.md CHANGED
@@ -9,8 +9,8 @@ SourceMonitor is a production-ready Rails 8 mountable engine for ingesting, norm
9
9
  In your host Rails app:
10
10
 
11
11
  ```bash
12
- bundle add source_monitor --version "~> 0.7.1"
13
- # or add `gem "source_monitor", "~> 0.7.1"` manually, then run:
12
+ bundle add source_monitor --version "~> 0.10.0"
13
+ # or add `gem "source_monitor", "~> 0.10.0"` manually, then run:
14
14
  bundle install
15
15
  ```
16
16
 
@@ -43,7 +43,7 @@ This exposes `bin/source_monitor` (via Bundler binstubs) so you can run the guid
43
43
  Before running any SourceMonitor commands inside your host app, add the gem and install dependencies:
44
44
 
45
45
  ```bash
46
- bundle add source_monitor --version "~> 0.7.1"
46
+ bundle add source_monitor --version "~> 0.10.0"
47
47
  # or edit your Gemfile, then run
48
48
  bundle install
49
49
  ```
@@ -93,14 +93,14 @@ See [examples/README.md](examples/README.md) for usage instructions.
93
93
  - Fetch/scrape log viewers with HTTP status, duration, backtrace, and Solid Queue job references
94
94
 
95
95
  ## Background Jobs & Scheduling
96
- - Solid Queue becomes the Active Job adapter when the host app still uses the inline `:async` adapter; queue names default to `source_monitor_fetch` and `source_monitor_scrape` and honour `ActiveJob.queue_name_prefix`.
96
+ - Solid Queue becomes the Active Job adapter when the host app still uses the inline `:async` adapter. Three queues are used: `source_monitor_fetch` (FetchFeedJob, ScheduleFetchesJob), `source_monitor_scrape` (ScrapeItemJob), and `source_monitor_maintenance` (health checks, cleanup, favicon, images, OPML import). All honour `ActiveJob.queue_name_prefix`.
97
97
  - `config/recurring.yml` schedules minute-level fetches and scrapes. Run `bin/jobs --recurring_schedule_file=config/recurring.yml` (or set `SOLID_QUEUE_RECURRING_SCHEDULE_FILE`) to load recurring tasks. Disable with `SOLID_QUEUE_SKIP_RECURRING=true`.
98
- - Retry/backoff behaviour is driven by `SourceMonitor.configure.fetching`. Fetch completion events and item processors allow you to chain downstream workflows (indexing, notifications, etc.).
98
+ - Retry/backoff behaviour is driven by `SourceMonitor.configure.fetching`. Scheduler batch size (default 25) and stale fetch timeout (default 5 minutes) are configurable for small-server deployments. Fetch completion events and item processors allow you to chain downstream workflows (indexing, notifications, etc.).
99
99
 
100
100
  ## Configuration & API Surface
101
101
  The generated initializer documents every setting. Key areas:
102
102
 
103
- - Queue namespace/concurrency helpers (`SourceMonitor.queue_name(:fetch)`)
103
+ - Queue namespace/concurrency helpers (`SourceMonitor.queue_name(:fetch)`, `:scrape`, `:maintenance`)
104
104
  - HTTP, retry, and proxy settings (Faraday-backed)
105
105
  - Scraper registry (`config.scrapers.register(:my_adapter, "MyApp::Scrapers::Custom")`)
106
106
  - Retention defaults (`config.retention.items_retention_days`, `config.retention.strategy`)
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.9.1
1
+ 0.10.0
data/app/jobs/source_monitor/download_content_images_job.rb CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  module SourceMonitor
4
4
  class DownloadContentImagesJob < ApplicationJob
5
- source_monitor_queue :fetch
5
+ source_monitor_queue :maintenance
6
6
 
7
7
  discard_on ActiveJob::DeserializationError
8
8
 
data/app/jobs/source_monitor/favicon_fetch_job.rb CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  module SourceMonitor
4
4
  class FaviconFetchJob < ApplicationJob
5
- source_monitor_queue :fetch
5
+ source_monitor_queue :maintenance
6
6
 
7
7
  discard_on ActiveJob::DeserializationError
8
8
 
data/app/jobs/source_monitor/import_opml_job.rb CHANGED
@@ -7,7 +7,7 @@ require "source_monitor/sources/params"
7
7
 
8
8
  module SourceMonitor
9
9
  class ImportOpmlJob < ApplicationJob
10
- source_monitor_queue :fetch
10
+ source_monitor_queue :maintenance
11
11
 
12
12
  discard_on ActiveJob::DeserializationError
13
13
 
data/app/jobs/source_monitor/import_session_health_check_job.rb CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  module SourceMonitor
4
4
  class ImportSessionHealthCheckJob < ApplicationJob
5
- source_monitor_queue :fetch
5
+ source_monitor_queue :maintenance
6
6
 
7
7
  require "source_monitor/health/import_source_health_check"
8
8
  require "source_monitor/import_sessions/entry_normalizer"
data/app/jobs/source_monitor/item_cleanup_job.rb CHANGED
@@ -4,7 +4,7 @@ module SourceMonitor
4
4
  class ItemCleanupJob < ApplicationJob
5
5
  DEFAULT_BATCH_SIZE = 100
6
6
 
7
- source_monitor_queue :fetch
7
+ source_monitor_queue :maintenance
8
8
 
9
9
  def perform(options = nil)
10
10
  options = SourceMonitor::Jobs::CleanupOptions.normalize(options)
data/app/jobs/source_monitor/log_cleanup_job.rb CHANGED
@@ -5,7 +5,7 @@ module SourceMonitor
5
5
  DEFAULT_FETCH_LOG_RETENTION_DAYS = 90
6
6
  DEFAULT_SCRAPE_LOG_RETENTION_DAYS = 45
7
7
 
8
- source_monitor_queue :fetch
8
+ source_monitor_queue :maintenance
9
9
 
10
10
  def perform(options = nil)
11
11
  options = SourceMonitor::Jobs::CleanupOptions.normalize(options)
data/app/jobs/source_monitor/schedule_fetches_job.rb CHANGED
@@ -23,7 +23,7 @@ module SourceMonitor
23
23
  options_hash = options_hash.symbolize_keys
24
24
  end
25
25
 
26
- options_hash[:limit] || SourceMonitor::Scheduler::DEFAULT_BATCH_SIZE
26
+ options_hash[:limit] || SourceMonitor.config.fetching.scheduler_batch_size
27
27
  end
28
28
  end
29
29
  end
data/app/jobs/source_monitor/source_health_check_job.rb CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  module SourceMonitor
4
4
  class SourceHealthCheckJob < ApplicationJob
5
- source_monitor_queue :fetch
5
+ source_monitor_queue :maintenance
6
6
 
7
7
  discard_on ActiveJob::DeserializationError
8
8
 
data/docs/configuration.md CHANGED
@@ -22,12 +22,15 @@ Restart your application whenever you change these settings. The engine reloads
22
22
  - `config.queue_namespace` – prefix applied to queue names (`"source_monitor"` by default)
23
23
  - `config.fetch_queue_name` / `config.scrape_queue_name` – base queue names before the host's `ActiveJob.queue_name_prefix` is applied
24
24
  - `config.fetch_queue_concurrency` / `config.scrape_queue_concurrency` – advisory values Solid Queue uses for per-queue limits
25
- - `config.queue_name_for(:fetch | :scrape)` – helper that respects the host's queue prefix
25
+ - `config.maintenance_queue_name` – queue name for maintenance jobs (`"source_monitor_maintenance"` by default)
26
+ - `config.maintenance_queue_concurrency` – advisory concurrency for the maintenance queue (default `1`)
27
+ - `config.queue_name_for(:fetch | :scrape | :maintenance)` – helper that respects the host's queue prefix
26
28
 
27
29
  Use the helpers exposed on `SourceMonitor`:
28
30
 
29
31
  ```ruby
30
- SourceMonitor.queue_name(:fetch) # => "source_monitor_fetch"
32
+ SourceMonitor.queue_name(:fetch) # => "source_monitor_fetch"
33
+ SourceMonitor.queue_name(:maintenance) # => "source_monitor_maintenance"
31
34
  SourceMonitor.queue_concurrency(:scrape) # => 2
32
35
  ```
33
36
 
@@ -59,6 +62,8 @@ The helper `SourceMonitor.mission_control_dashboard_path` performs a routing che
59
62
  - `increase_factor` / `decrease_factor` – multipliers when a source trends slow/fast
60
63
  - `failure_increase_factor` – multiplier applied on consecutive failures
61
64
  - `jitter_percent` – random jitter applied to next fetch time (0.1 = ±10%)
65
+ - `scheduler_batch_size` – max sources picked up per scheduler run (default `25`, was `100`)
66
+ - `stale_timeout_minutes` – minutes before a source stuck in "fetching" is reset (default `5`, was `10`)
62
67
 
63
68
  ## Retention Defaults
64
69
 
@@ -162,6 +167,10 @@ The engine honours several environment variables out of the box:
162
167
  - `SOLID_QUEUE_RECURRING_SCHEDULE_FILE` – alternative schedule file path
163
168
  - `SOFT_DELETE` / `SOURCE_IDS` / `SOURCE_ID` – overrides for item cleanup rake tasks
164
169
  - `FETCH_LOG_DAYS` / `SCRAPE_LOG_DAYS` – retention windows for log cleanup
170
+ - `WINDOW_MINUTES` – time window (minutes) for `stagger_fetch_times` rake task (default `10`)
171
+ - `SOURCE_MONITOR_FETCH_CONCURRENCY` – override fetch queue concurrency in `solid_queue.yml`
172
+ - `SOURCE_MONITOR_SCRAPE_CONCURRENCY` – override scrape queue concurrency in `solid_queue.yml`
173
+ - `SOURCE_MONITOR_MAINTENANCE_CONCURRENCY` – override maintenance queue concurrency in `solid_queue.yml`
165
174
 
166
175
  ## After Changing Configuration
167
176
 
data/docs/deployment.md CHANGED
@@ -16,7 +16,7 @@ This guide captures the production considerations for running SourceMonitor insi
16
16
  SourceMonitor assumes the standard Rails 8 process split:
17
17
 
18
18
  - **Web** – your application server (Puma) serving the mounted engine and Action Cable. When using Solid Cable, no separate Redis process is required.
19
- - **Worker** – at least one Solid Queue worker (`bin/rails solid_queue:start`). Scale horizontally to match feed volume and retention pruning needs. Use queue selectors if you dedicate workers to `source_monitor_fetch` or `source_monitor_scrape`.
19
+ - **Worker** – at least one Solid Queue worker (`bin/rails solid_queue:start`). Scale horizontally to match feed volume and retention pruning needs. The engine uses three queues: `source_monitor_fetch` (time-sensitive feed polling), `source_monitor_scrape` (content extraction), and `source_monitor_maintenance` (health checks, cleanup, favicon, images, OPML import). Use queue selectors if you dedicate workers to specific queues.
20
20
  - **Scheduler/Recurring** – optional process invoking `bin/jobs --recurring_schedule_file=config/recurring.yml` so the bundled recurring tasks enqueue fetch/scrape/cleanup jobs. Disable with `SOLID_QUEUE_SKIP_RECURRING=true` when another scheduler handles cron-style jobs.
21
21
 
22
22
  ## Database & Storage
@@ -41,6 +41,10 @@ SourceMonitor assumes the standard Rails 8 process split:
41
41
 
42
42
  - Increase `config.fetch_queue_concurrency` and the number of Solid Queue workers as source volume grows.
43
43
  - Adjust `config.fetching` multipliers to smooth out noisy feeds; raising `failure_increase_factor` slows retries for consistently failing sources.
44
+ - Tune `config.fetching.scheduler_batch_size` (default 25) to control how many sources are picked up per scheduler run. On larger servers, increase this to 50-100.
45
+ - The `config.fetching.stale_timeout_minutes` (default 5) controls how quickly stuck "fetching" sources are recovered. Lower values mean faster recovery but more aggressive reconciliation.
46
+ - After deploys or queue stalls where many sources become overdue simultaneously, run `bin/rails source_monitor:maintenance:stagger_fetch_times` to distribute them across a time window and prevent thundering herd.
47
+ - The maintenance queue (concurrency 1 by default) handles non-time-sensitive work. Scale independently of fetch/scrape via `config.maintenance_queue_concurrency` or `SOURCE_MONITOR_MAINTENANCE_CONCURRENCY` env var.
44
48
  - Use `config.retention` to cap database growth; nightly cleanup jobs can run on separate workers if pruning becomes heavy.
45
49
 
46
50
  ## Rolling Upgrades
data/docs/setup.md CHANGED
@@ -18,8 +18,8 @@ This guide consolidates the new guided installer, verification commands, and rol
18
18
  Run these commands inside your host Rails application before invoking the guided workflow:
19
19
 
20
20
  ```bash
21
- bundle add source_monitor --version "~> 0.3.1"
22
- # or add gem "source_monitor", "~> 0.3.1" to Gemfile manually
21
+ bundle add source_monitor --version "~> 0.10.0"
22
+ # or add gem "source_monitor", "~> 0.10.0" to Gemfile manually
23
23
  bundle install
24
24
  ```
25
25
 
data/docs/troubleshooting.md CHANGED
@@ -58,38 +58,52 @@ This guide lists common issues you might encounter while installing, upgrading,
58
58
  - When switching to Redis, add `config.realtime.adapter = :redis` and `config.realtime.redis_url` in the initializer, then restart web and worker processes.
59
59
  - For Solid Cable, check that the `solid_cable_messages` table exists and that no other process clears it unexpectedly.
60
60
 
61
- ## 7. Fetch Jobs Keep Failing
61
+ ## 7. Sources Show "Overdue" on Dashboard
62
+
63
+ - **Symptoms:** Many sources show as overdue on the dashboard, especially after deploys or on sites with hundreds of sources.
64
+ - **Thundering herd:** If many sources became due simultaneously (e.g., after a queue stall), they overwhelm the scheduler's per-run batch size (default 25). Run the stagger task to spread them out:
65
+ ```bash
66
+ bin/rails source_monitor:maintenance:stagger_fetch_times WINDOW_MINUTES=10
67
+ ```
68
+ - **Stuck "fetching" sources:** The stalled fetch reconciler automatically resets sources stuck in "fetching" status after `config.fetching.stale_timeout_minutes` (default 5 minutes). For manual recovery:
69
+ ```bash
70
+ bin/rails source_monitor:maintenance:recover_stalled_fetches
71
+ ```
72
+ - **Batch size too small:** If you have hundreds of sources, the default batch size of 25 may cause a backlog. Increase via `config.fetching.scheduler_batch_size = 50` in your initializer.
73
+ - **Queue separation:** Ensure your `solid_queue.yml` includes all three SourceMonitor queues (`source_monitor_fetch`, `source_monitor_scrape`, `source_monitor_maintenance`). Non-fetch jobs on the wrong queue can starve fetch processing.
74
+
75
+ ## 8. Fetch Jobs Keep Failing
62
76
 
63
77
  - Review the most recent fetch log entry for the source; it stores the HTTP status, error class, and error message.
64
78
  - Increase `config.http.timeout` or `config.http.retry_max` if the feed is slow or prone to transient errors.
65
79
  - Supply custom headers or basic auth credentials via the source form when feeds require authentication.
66
80
  - Check for TLS issues on self-signed feeds; you may need to configure Faraday with custom SSL options.
67
81
 
68
- ## 8. Scraping Returns "Failed"
82
+ ## 9. Scraping Returns "Failed"
69
83
 
70
84
  - Confirm the source has scraping enabled and the configured adapter exists.
71
85
  - Override selectors in the source's scrape settings if the default Readability extraction misses key elements.
72
86
  - Inspect the scrape log to see the adapter status and content length. Logs store the HTTP status and any exception raised by the adapter.
73
87
  - Retry manually from the item detail page after fixing selectors.
74
88
 
75
- ## 9. Cleanup Rake Tasks Fail
89
+ ## 10. Cleanup Rake Tasks Fail
76
90
 
77
91
  - Pass numeric values for `FETCH_LOG_DAYS` or `SCRAPE_LOG_DAYS` environment variables (e.g., `FETCH_LOG_DAYS=30`).
78
92
  - Ensure workers or the console environment have permission to soft delete (`SOFT_DELETE=true`) if you expect tombstones.
79
93
  - If job classes cannot load, verify `SourceMonitor.configure` ran before calling `rake source_monitor:cleanup:*`.
80
94
 
81
- ## 10. Test Suite Cannot Launch a Browser
95
+ ## 11. Test Suite Cannot Launch a Browser
82
96
 
83
97
  - System tests rely on Selenium + Chrome. Install Chrome/Chromium and set `SELENIUM_CHROME_BINARY` if the binary lives in a non-standard path.
84
98
  - You can run `rbenv exec bin/test-coverage --verbose` to inspect failures with additional logging.
85
99
 
86
- ## 11. Mission Control Jobs Link Returns 404
100
+ ## 12. Mission Control Jobs Link Returns 404
87
101
 
88
102
  - Mount `MissionControl::Jobs::Engine` in your host routes (for example, `mount MissionControl::Jobs::Engine, at: "/mission_control"`).
89
103
  - Keep `config.mission_control_enabled = true` **and** `config.mission_control_dashboard_path` pointing at that mounted route helper. Call `SourceMonitor.mission_control_dashboard_path` in the Rails console to confirm it resolves.
90
104
  - When hosting Mission Control in a separate app, provide a full URL instead of a route helper and ensure CORS/WebSocket settings allow the dashboard iframe.
91
105
 
92
- ## 12. Tailwind Build Fails or Admin UI Loads Without Styles
106
+ ## 13. Tailwind Build Fails or Admin UI Loads Without Styles
93
107
 
94
108
  - Running `test/dummy/bin/dev` before configuring the bundling pipeline will serve the admin UI without Tailwind styles or Stimulus behaviours. This happens because the engine no longer ships precompiled assets; see `.ai/engine-asset-configuration.md:11-44` for the required npm setup.
95
109
  - Fix by running `npm install` followed by `npm run build` inside the engine root so that `app/assets/builds/source_monitor/application.css` and `application.js` exist. The Rake task `app:source_monitor:assets:build` wraps the same scripts for CI usage.
data/docs/upgrade.md CHANGED
@@ -46,6 +46,33 @@ If a removed option raises an error (`SourceMonitor::DeprecatedOptionError`), yo
46
46
 
47
47
  ## Version-Specific Notes
48
48
 
49
+ ### Upgrading to 0.10.0 (from 0.9.x)
50
+
51
+ **What changed:**
52
+ - New third queue: `source_monitor_maintenance` separates non-fetch jobs from the fetch pipeline. Health checks, cleanup, favicon, image download, and OPML import jobs now use the maintenance queue.
53
+ - Scheduler batch size configurable via `config.fetching.scheduler_batch_size` (default reduced from 100 to 25).
54
+ - Stale fetch timeout configurable via `config.fetching.stale_timeout_minutes` (default reduced from 10 to 5).
55
+ - Fixed-interval sources now receive ±10% jitter on `next_fetch_at`.
56
+ - Fetch pipeline error handling hardened: DB errors propagate, broadcast errors are still rescued, `ensure` block guarantees status reset.
57
+ - New rake task: `source_monitor:maintenance:stagger_fetch_times` distributes overdue sources across a time window.
58
+
59
+ **Upgrade steps:**
60
+ ```bash
61
+ bundle update source_monitor
62
+ bin/rails source_monitor:upgrade
63
+ bin/rails db:migrate
64
+ ```
65
+
66
+ **Notes:**
67
+ - **Action required:** Update your `solid_queue.yml` to include the new maintenance queue. Add:
68
+ ```yaml
69
+ source_monitor_maintenance:
70
+ concurrency: <%= ENV.fetch("SOURCE_MONITOR_MAINTENANCE_CONCURRENCY", 1) %>
71
+ ```
72
+ - If you have many sources that are overdue after upgrading, run `bin/rails source_monitor:maintenance:stagger_fetch_times` to break the thundering herd.
73
+ - The default batch size (25) and stale timeout (5 min) are tuned for 1-CPU/2GB servers. Scale up via `config.fetching.scheduler_batch_size` and `config.fetching.stale_timeout_minutes` for larger deployments.
74
+ - No breaking changes to public API. All existing initializer configuration remains valid.
75
+
49
76
  ### Upgrading to 0.8.0 (from 0.7.x)
50
77
 
51
78
  **What changed:**
data/lib/source_monitor/configuration/fetching_settings.rb CHANGED
@@ -8,7 +8,9 @@ module SourceMonitor
8
8
  :increase_factor,
9
9
  :decrease_factor,
10
10
  :failure_increase_factor,
11
- :jitter_percent
11
+ :jitter_percent,
12
+ :scheduler_batch_size,
13
+ :stale_timeout_minutes
12
14
 
13
15
  def initialize
14
16
  reset!
@@ -21,6 +23,8 @@ module SourceMonitor
21
23
  @decrease_factor = 0.75
22
24
  @failure_increase_factor = 1.5
23
25
  @jitter_percent = 0.1
26
+ @scheduler_batch_size = 25
27
+ @stale_timeout_minutes = 5
24
28
  end
25
29
  end
26
30
  end
data/lib/source_monitor/configuration.rb CHANGED
@@ -22,8 +22,10 @@ module SourceMonitor
22
22
  attr_accessor :queue_namespace,
23
23
  :fetch_queue_name,
24
24
  :scrape_queue_name,
25
+ :maintenance_queue_name,
25
26
  :fetch_queue_concurrency,
26
27
  :scrape_queue_concurrency,
28
+ :maintenance_queue_concurrency,
27
29
  :recurring_command_job_class,
28
30
  :job_metrics_enabled,
29
31
  :mission_control_enabled,
@@ -37,8 +39,10 @@ module SourceMonitor
37
39
  @queue_namespace = DEFAULT_QUEUE_NAMESPACE
38
40
  @fetch_queue_name = "#{DEFAULT_QUEUE_NAMESPACE}_fetch"
39
41
  @scrape_queue_name = "#{DEFAULT_QUEUE_NAMESPACE}_scrape"
42
+ @maintenance_queue_name = "#{DEFAULT_QUEUE_NAMESPACE}_maintenance"
40
43
  @fetch_queue_concurrency = 2
41
44
  @scrape_queue_concurrency = 2
45
+ @maintenance_queue_concurrency = 1
42
46
  @recurring_command_job_class = nil
43
47
  @job_metrics_enabled = true
44
48
  @mission_control_enabled = false
@@ -64,6 +68,8 @@ module SourceMonitor
64
68
  fetch_queue_name
65
69
  when :scrape
66
70
  scrape_queue_name
71
+ when :maintenance
72
+ maintenance_queue_name
67
73
  else
68
74
  raise ArgumentError, "unknown queue role #{role.inspect}"
69
75
  end
@@ -84,6 +90,8 @@ module SourceMonitor
84
90
  fetch_queue_concurrency
85
91
  when :scrape
86
92
  scrape_queue_concurrency
93
+ when :maintenance
94
+ maintenance_queue_concurrency
87
95
  else
88
96
  raise ArgumentError, "unknown queue role #{role.inspect}"
89
97
  end
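Reviewer note: the two `case` additions above extend role lookup to `:maintenance`. The following is a minimal stand-alone sketch of that lookup, not the engine's actual class. `QueueSettingsSketch` and `concurrency_for` are hypothetical names introduced here for illustration; only the attribute names, defaults, and the error message are taken from the diff. Handling of `ActiveJob.queue_name_prefix` is deliberately left out.

```ruby
# Hypothetical stand-in for the engine's configuration object.
# Attribute names and default values mirror the diff; everything else is assumed.
QueueSettingsSketch = Struct.new(
  :fetch_queue_name, :scrape_queue_name, :maintenance_queue_name,
  :fetch_queue_concurrency, :scrape_queue_concurrency, :maintenance_queue_concurrency,
  keyword_init: true
) do
  # Resolve a role symbol to its configured queue name.
  def queue_name_for(role)
    case role
    when :fetch then fetch_queue_name
    when :scrape then scrape_queue_name
    when :maintenance then maintenance_queue_name
    else raise ArgumentError, "unknown queue role #{role.inspect}"
    end
  end

  # Resolve a role symbol to its advisory concurrency (method name is assumed).
  def concurrency_for(role)
    case role
    when :fetch then fetch_queue_concurrency
    when :scrape then scrape_queue_concurrency
    when :maintenance then maintenance_queue_concurrency
    else raise ArgumentError, "unknown queue role #{role.inspect}"
    end
  end
end

# Defaults as they appear in the 0.10.0 configuration hunk.
def default_queue_settings_sketch
  QueueSettingsSketch.new(
    fetch_queue_name: "source_monitor_fetch",
    scrape_queue_name: "source_monitor_scrape",
    maintenance_queue_name: "source_monitor_maintenance",
    fetch_queue_concurrency: 2,
    scrape_queue_concurrency: 2,
    maintenance_queue_concurrency: 1
  )
end
```

An unknown role still raises `ArgumentError`, so a typo such as `:maintenence` fails loudly instead of silently routing work to the wrong queue.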
@@ -16,7 +16,13 @@ module SourceMonitor
16
16
  Array(result.item_processing&.created_items).each do |item|
17
17
  next unless item.present? && item.scraped_at.nil?
18
18
 
19
- enqueuer_class.enqueue(item:, source:, job_class:, reason: :auto)
19
+ begin
20
+ enqueuer_class.enqueue(item:, source:, job_class:, reason: :auto)
21
+ rescue StandardError => error
22
+ Rails.logger.error(
23
+ "[SourceMonitor] FollowUpHandler: failed to enqueue scrape for item #{item.id}: #{error.class}: #{error.message}"
24
+ ) if defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger
25
+ end
20
26
  end
21
27
  end
22
28
 
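Reviewer note: wrapping each `enqueue` call in its own `begin`/`rescue` means one failing item no longer aborts the rest of the loop. A small self-contained sketch of that pattern follows. `enqueue_all`, the lambda enqueuer, and the lambda logger are hypothetical stand-ins, not the engine's `FollowUpHandler` API.

```ruby
# Enqueue every item, logging and skipping the ones that fail.
# `enqueuer` and `logger` are plain callables so the sketch runs without Rails.
def enqueue_all(items, enqueuer:, logger: nil)
  enqueued = []

  items.each do |item|
    begin
      enqueuer.call(item)
      enqueued << item
    rescue StandardError => error
      # Mirrors the diff: log the failure and continue with the next item.
      logger&.call("failed to enqueue scrape for item #{item}: #{error.class}: #{error.message}")
    end
  end

  enqueued
end

# Usage: the second item fails, the first and third are still enqueued.
flaky_enqueuer = ->(item) { raise "queue unavailable" if item == 2 }
messages = []
enqueue_all([1, 2, 3], enqueuer: flaky_enqueuer, logger: ->(msg) { messages << msg })
```

The trade-off is that a failed enqueue is only visible in the log, so the item is not scraped until something else picks it up.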
@@ -29,7 +29,8 @@ module SourceMonitor
29
29
  attributes[:backoff_until] = failure ? scheduled_time : nil
30
30
  else
31
31
  fixed_minutes = [ source.fetch_interval_minutes.to_i, 1 ].max
32
- attributes[:next_fetch_at] = Time.current + fixed_minutes.minutes
32
+ fixed_seconds = fixed_minutes * 60.0
33
+ attributes[:next_fetch_at] = Time.current + adjusted_interval_with_jitter(fixed_seconds)
33
34
  attributes[:backoff_until] = nil
34
35
  end
35
36
  end
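Reviewer note: the body of `adjusted_interval_with_jitter` is not part of this diff. The sketch below assumes it applies a uniform random offset of ±`jitter_percent` around the interval, which is what the documented "0.1 = ±10%" suggests. `interval_with_jitter` is a hypothetical name and the real helper may differ.

```ruby
# Assumed behaviour: uniform jitter of +/- jitter_percent around the interval.
# With the default 0.1, a 600-second interval lands between 540 and 660 seconds.
def interval_with_jitter(interval_seconds, jitter_percent: 0.1, rng: Random.new)
  return interval_seconds.to_f if jitter_percent.zero?

  spread = interval_seconds * jitter_percent.to_f
  interval_seconds + rng.rand(-spread..spread)
end

# Usage: a fixed 10-minute source, converted to seconds as in the diff.
fixed_seconds = 10 * 60.0
next_fetch_in = interval_with_jitter(fixed_seconds)
```

Spreading fixed-interval sources this way keeps sources that share an interval from drifting back into the same scheduler run.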
@@ -69,6 +69,13 @@ module SourceMonitor
69
69
  mark_failed!(error)
70
70
  event_publisher.call(source:, result: nil)
71
71
  raise
72
+ ensure
73
+ begin
74
+ source.reload
75
+ source.update!(fetch_status: "failed") if source.fetch_status == "fetching"
76
+ rescue StandardError # :nocov:
77
+ nil
78
+ end
72
79
  end
73
80
 
74
81
  private
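Reviewer note: the new `ensure` block is a safety net for any path that leaves the source in `"fetching"`. The sketch below shows the shape of that guarantee with a hypothetical `SourceSketch` struct and `run_fetch` method. It omits the `reload` and the inner `rescue` from the diff because there is no database involved here.

```ruby
# Hypothetical in-memory source; the real model is an Active Record object.
SourceSketch = Struct.new(:fetch_status)

# Run a fetch block and guarantee the source never stays in "fetching".
def run_fetch(source)
  source.fetch_status = "fetching"
  yield source
rescue StandardError
  # Normal failure path: mark failed, then re-raise for the caller.
  source.fetch_status = "failed"
  raise
ensure
  # Safety net: runs on every exit, including paths that skipped the rescue.
  source.fetch_status = "failed" if source.fetch_status == "fetching"
end

# Usage: a block that forgets to update the status is still cleaned up.
forgotten = SourceSketch.new("idle")
run_fetch(forgotten) { |_source| nil }
```

A successful block that sets its own final status is left untouched, since the `ensure` only acts when the status is still `"fetching"`.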
@@ -82,11 +89,13 @@ module SourceMonitor
82
89
 
83
90
  def self.update_source_state!(source, attrs)
84
91
  source.update!(attrs)
85
- SourceMonitor::Realtime.broadcast_source(source)
86
- rescue StandardError => error
87
- Rails.logger.error(
88
- "[SourceMonitor] Failed to update fetch state for source #{source.id}: #{error.class}: #{error.message}"
89
- ) if defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger
92
+ begin
93
+ SourceMonitor::Realtime.broadcast_source(source)
94
+ rescue StandardError => error
95
+ Rails.logger.error(
96
+ "[SourceMonitor] Failed to broadcast source #{source.id}: #{error.class}: #{error.message}"
97
+ ) if defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger
98
+ end
90
99
  end
91
100
  private_class_method :update_source_state!
92
101
 
@@ -43,11 +43,9 @@ module SourceMonitor
43
43
  attr_reader :now, :stale_after
44
44
 
45
45
  def self.default_stale_after
46
- if defined?(SourceMonitor::Scheduler::STALE_QUEUE_TIMEOUT)
47
- SourceMonitor::Scheduler::STALE_QUEUE_TIMEOUT
48
- else
49
- 10.minutes
50
- end
46
+ SourceMonitor.config.fetching.stale_timeout_minutes.minutes
47
+ rescue NoMethodError
48
+ 10.minutes
51
49
  end
52
50
 
53
51
  def stale_sources
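Reviewer note: both the reconciler default and the scheduler now derive the stale window from `config.fetching.stale_timeout_minutes`. The following plain-Ruby sketch shows the resulting cutoff arithmetic without ActiveSupport. `FetchingSettingsSketch`, `stale_cutoff`, and `stalled?` are hypothetical names, and the `<=` comparison mirrors the `lteq(stale_cutoff)` condition in the scheduler.

```ruby
# Hypothetical stand-in for config.fetching; only the setting name is from the diff.
FetchingSettingsSketch = Struct.new(:stale_timeout_minutes)

# Timestamp before which an in-progress fetch counts as stale.
def stale_cutoff(now, settings)
  now - (settings.stale_timeout_minutes * 60)
end

# True when a fetch started at or before the cutoff.
def stalled?(last_fetch_started_at, now:, settings:)
  last_fetch_started_at <= stale_cutoff(now, settings)
end

# Usage with the new default of 5 minutes.
settings = FetchingSettingsSketch.new(5)
now = Time.now
stalled?(now - 360, now: now, settings: settings) # started 6 minutes ago
```

Lowering the setting recovers stuck sources sooner, at the cost of resetting fetches that are merely slow.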
@@ -5,11 +5,11 @@ require "source_monitor/fetching/stalled_fetch_reconciler"
5
5
 
6
6
  module SourceMonitor
7
7
  class Scheduler
8
- DEFAULT_BATCH_SIZE = 100
9
- STALE_QUEUE_TIMEOUT = 10.minutes
8
+ DEFAULT_BATCH_SIZE = 100 # legacy fallback
9
+ STALE_QUEUE_TIMEOUT = 10.minutes # legacy fallback
10
10
  ELIGIBLE_FETCH_STATUSES = %w[idle failed].freeze
11
11
 
12
- def self.run(limit: DEFAULT_BATCH_SIZE, now: Time.current)
12
+ def self.run(limit: SourceMonitor.config.fetching.scheduler_batch_size, now: Time.current)
13
13
  new(limit:, now:).run
14
14
  end
15
15
 
@@ -20,7 +20,7 @@ module SourceMonitor
20
20
 
21
21
  def run
22
22
  payload = { limit: limit }
23
- recovery = SourceMonitor::Fetching::StalledFetchReconciler.call(now:, stale_after: STALE_QUEUE_TIMEOUT)
23
+ recovery = SourceMonitor::Fetching::StalledFetchReconciler.call(now:, stale_after: stale_timeout)
24
24
  payload[:stalled_recoveries] = recovery.recovered_source_ids.size
25
25
  payload[:stalled_jobs_removed] = recovery.jobs_removed.size
26
26
 
@@ -43,6 +43,10 @@ module SourceMonitor
43
43
 
44
44
  attr_reader :limit, :now
45
45
 
46
+ def stale_timeout
47
+ SourceMonitor.config.fetching.stale_timeout_minutes.minutes
48
+ end
49
+
46
50
  def lock_due_source_ids
47
51
  ids = []
48
52
 
@@ -72,7 +76,7 @@ module SourceMonitor
72
76
  table = SourceMonitor::Source.arel_table
73
77
 
74
78
  eligible = table[:fetch_status].in(ELIGIBLE_FETCH_STATUSES)
75
- stale_cutoff = now - STALE_QUEUE_TIMEOUT
79
+ stale_cutoff = now - stale_timeout
76
80
  stale_queued = table[:fetch_status].eq("queued").and(table[:updated_at].lteq(stale_cutoff))
77
81
  stale_fetching = table[:fetch_status].eq("fetching").and(table[:last_fetch_started_at].lteq(stale_cutoff))
78
82
 
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module SourceMonitor
4
- VERSION = "0.9.1"
4
+ VERSION = "0.10.0"
5
5
  end
@@ -0,0 +1,37 @@
1
+ # frozen_string_literal: true
2
+
3
+ namespace :source_monitor do
4
+ namespace :maintenance do
5
+ desc "Spread due sources' next_fetch_at across a time window to break thundering herd"
6
+ task stagger_fetch_times: :environment do
7
+ window_minutes = (ENV["WINDOW_MINUTES"] || 10).to_i
8
+ window_seconds = window_minutes * 60.0
9
+
10
+ sources = SourceMonitor::Source
11
+ .active
12
+ .where(fetch_status: %w[idle failed])
13
+ .where(
14
+ SourceMonitor::Source.arel_table[:next_fetch_at].eq(nil).or(
15
+ SourceMonitor::Source.arel_table[:next_fetch_at].lteq(Time.current)
16
+ )
17
+ )
18
+ .order(:id)
19
+
20
+ count = sources.count
21
+
22
+ if count.zero?
23
+ puts "No sources need staggering."
24
+ else
25
+ now = Time.current
26
+ step = count > 1 ? window_seconds / (count - 1).to_f : 0.0
27
+
28
+ sources.find_each.with_index do |source, index|
29
+ offset = step * index
30
+ source.update_columns(next_fetch_at: now + offset)
31
+ end
32
+
33
+ puts "Staggered #{count} sources across #{window_minutes} minutes."
34
+ end
35
+ end
36
+ end
37
+ end
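Reviewer note: the offset arithmetic in the rake task can be checked in isolation. The sketch below extracts it into a hypothetical `stagger_offsets` helper that returns the per-source offsets in seconds. The formula is the one from the task above; the helper name and return shape are assumptions made for illustration.

```ruby
# Offsets (in seconds) assigned to `count` due sources over the window.
# The first source gets 0 and the last gets the full window.
def stagger_offsets(count, window_minutes: 10)
  window_seconds = window_minutes * 60.0
  step = count > 1 ? window_seconds / (count - 1) : 0.0

  Array.new(count) { |index| step * index }
end

stagger_offsets(5) # => [0.0, 150.0, 300.0, 450.0, 600.0]
```

With a single due source the step is `0.0`, so that source is scheduled immediately, and an empty set yields no offsets at all.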
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: source_monitor
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.9.1
4
+ version: 0.10.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - dchuk
@@ -635,6 +635,7 @@ files:
635
635
  - lib/tasks/source_monitor_assets.rake
636
636
  - lib/tasks/source_monitor_setup.rake
637
637
  - lib/tasks/source_monitor_tasks.rake
638
+ - lib/tasks/stagger_fetch_times.rake
638
639
  - lib/tasks/test_fast.rake
639
640
  - lib/tasks/test_smoke.rake
640
641
  - package-lock.json
@@ -693,7 +694,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
693
694
  - !ruby/object:Gem::Version
694
695
  version: '0'
695
696
  requirements: []
696
- rubygems_version: 4.0.3
697
+ rubygems_version: 4.0.6
697
698
  specification_version: 4
698
699
  summary: SourceMonitor engine for ingesting, scraping, and monitoring RSS/Atom/JSON
699
700
  feeds