source_monitor 0.3.0 → 0.3.2
- checksums.yaml +4 -4
- data/.claude/skills/sm-architecture/SKILL.md +233 -0
- data/.claude/skills/sm-architecture/reference/extraction-patterns.md +192 -0
- data/.claude/skills/sm-architecture/reference/module-map.md +194 -0
- data/.claude/skills/sm-configuration-setting/SKILL.md +264 -0
- data/.claude/skills/sm-configuration-setting/reference/settings-catalog.md +248 -0
- data/.claude/skills/sm-configuration-setting/reference/settings-pattern.md +297 -0
- data/.claude/skills/sm-configure/SKILL.md +153 -0
- data/.claude/skills/sm-configure/reference/configuration-reference.md +321 -0
- data/.claude/skills/sm-dashboard-widget/SKILL.md +344 -0
- data/.claude/skills/sm-dashboard-widget/reference/dashboard-patterns.md +304 -0
- data/.claude/skills/sm-domain-model/SKILL.md +188 -0
- data/.claude/skills/sm-domain-model/reference/model-graph.md +114 -0
- data/.claude/skills/sm-domain-model/reference/table-structure.md +348 -0
- data/.claude/skills/sm-engine-migration/SKILL.md +395 -0
- data/.claude/skills/sm-engine-migration/reference/migration-conventions.md +255 -0
- data/.claude/skills/sm-engine-test/SKILL.md +302 -0
- data/.claude/skills/sm-engine-test/reference/test-helpers.md +259 -0
- data/.claude/skills/sm-engine-test/reference/test-patterns.md +411 -0
- data/.claude/skills/sm-event-handler/SKILL.md +265 -0
- data/.claude/skills/sm-event-handler/reference/events-api.md +229 -0
- data/.claude/skills/sm-health-rule/SKILL.md +327 -0
- data/.claude/skills/sm-health-rule/reference/health-system.md +269 -0
- data/.claude/skills/sm-host-setup/SKILL.md +223 -0
- data/.claude/skills/sm-host-setup/reference/initializer-template.md +195 -0
- data/.claude/skills/sm-host-setup/reference/setup-checklist.md +134 -0
- data/.claude/skills/sm-job/SKILL.md +263 -0
- data/.claude/skills/sm-job/reference/job-conventions.md +245 -0
- data/.claude/skills/sm-model-extension/SKILL.md +287 -0
- data/.claude/skills/sm-model-extension/reference/extension-api.md +317 -0
- data/.claude/skills/sm-pipeline-stage/SKILL.md +254 -0
- data/.claude/skills/sm-pipeline-stage/reference/completion-handlers.md +152 -0
- data/.claude/skills/sm-pipeline-stage/reference/entry-processing.md +191 -0
- data/.claude/skills/sm-pipeline-stage/reference/feed-fetcher-architecture.md +198 -0
- data/.claude/skills/sm-scraper-adapter/SKILL.md +284 -0
- data/.claude/skills/sm-scraper-adapter/reference/adapter-contract.md +167 -0
- data/.claude/skills/sm-scraper-adapter/reference/example-adapter.md +274 -0
- data/.vbw-planning/.notification-log.jsonl +102 -0
- data/.vbw-planning/.session-log.jsonl +505 -0
- data/AGENTS.md +20 -57
- data/CHANGELOG.md +19 -0
- data/CLAUDE.md +44 -1
- data/CONTRIBUTING.md +5 -5
- data/Gemfile.lock +20 -21
- data/README.md +18 -5
- data/VERSION +1 -0
- data/docs/deployment.md +1 -1
- data/docs/setup.md +4 -4
- data/lib/source_monitor/setup/skills_installer.rb +94 -0
- data/lib/source_monitor/setup/workflow.rb +17 -2
- data/lib/source_monitor/version.rb +1 -1
- data/lib/tasks/source_monitor_setup.rake +58 -0
- data/source_monitor.gemspec +1 -0
- metadata +39 -1
data/.claude/skills/sm-host-setup/reference/setup-checklist.md
@@ -0,0 +1,134 @@

# Host App Setup Checklist

Step-by-step checklist for integrating SourceMonitor into a host Rails application.

## Phase 1: Prerequisites

- [ ] **Ruby 3.4+** installed (`ruby -v`)
- [ ] **Rails 8.0+** host application (`bin/rails about`)
- [ ] **PostgreSQL 14+** running and accessible
- [ ] **Node.js 18+** installed (for Tailwind/esbuild assets)
- [ ] **Solid Queue** in host Gemfile (`gem "solid_queue"`)
- [ ] **Solid Cable** or Redis available for Action Cable

## Phase 2: Install the Gem

```bash
# Option A: Released version
bundle add source_monitor --version "~> 0.3.0"

# Option B: GitHub edge
# Add to Gemfile: gem "source_monitor", github: "dchuk/source_monitor"

bundle install
```

- [ ] Gem added to Gemfile
- [ ] `bundle install` succeeds

## Phase 3: Run the Generator

```bash
# Guided (recommended)
bin/source_monitor install

# Manual
bin/rails generate source_monitor:install --mount-path=/source_monitor
```

Verify after running:

- [ ] `config/routes.rb` contains `mount SourceMonitor::Engine, at: "/source_monitor"`
- [ ] `config/initializers/source_monitor.rb` exists

## Phase 4: Database Setup

```bash
# Copy engine migrations to host
bin/rails railties:install:migrations FROM=source_monitor

# Apply all pending migrations
bin/rails db:migrate
```

- [ ] Engine migrations copied (check `db/migrate/` for `sourcemon_*` tables)
- [ ] `bin/rails db:migrate` succeeds
- [ ] Tables created: `sourcemon_sources`, `sourcemon_items`, `sourcemon_fetch_logs`, etc.

## Phase 5: Configure Authentication

Edit `config/initializers/source_monitor.rb`:

```ruby
SourceMonitor.configure do |config|
  # Devise example
  config.authentication.authenticate_with :authenticate_user!
  config.authentication.authorize_with ->(controller) {
    controller.current_user&.admin?
  }
  config.authentication.current_user_method = :current_user
  config.authentication.user_signed_in_method = :user_signed_in?
end
```

- [ ] Authentication hook configured
- [ ] Authorization hook configured (if needed)
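These hooks are not Devise-specific. As a hedged sketch, a host with hand-rolled session auth might wire the same settings to its own helpers (`require_login`, `current_account`, and `logged_in?` here are hypothetical host methods — substitute your app's own):

```ruby
SourceMonitor.configure do |config|
  # Hypothetical host helpers -- replace with your app's actual methods
  config.authentication.authenticate_with :require_login
  config.authentication.current_user_method = :current_account
  config.authentication.user_signed_in_method = :logged_in?
end
```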
## Phase 6: Configure Workers

Ensure `config/solid_queue.yml` (or equivalent) includes the SourceMonitor queues:

```yaml
# config/solid_queue.yml
production:
  dispatchers:
    - polling_interval: 1
      batch_size: 500
  workers:
    - queues: "source_monitor_fetch"
      threads: 2
      processes: 1
    - queues: "source_monitor_scrape"
      threads: 2
      processes: 1
```

```bash
bin/rails solid_queue:start
```

- [ ] Queue configuration includes `source_monitor_fetch` and `source_monitor_scrape`
- [ ] Workers started and processing

## Phase 7: Verify Installation

```bash
bin/source_monitor verify
```

- [ ] Verification passes (exit code 0)
- [ ] Dashboard loads at mount path (e.g., `http://localhost:3000/source_monitor`)
- [ ] Create a test source, trigger "Fetch Now", confirm items appear

## Phase 8: Optional Configuration

- [ ] HTTP client tuned (timeouts, proxy, retries)
- [ ] Fetching intervals configured for your workload
- [ ] Health thresholds adjusted
- [ ] Retention policy set
- [ ] Custom scraper adapters registered
- [ ] Event callbacks wired for host integration
- [ ] Realtime adapter confirmed (Solid Cable or Redis)
- [ ] Mission Control integration enabled (if desired)

## Troubleshooting

| Problem | Solution |
|---|---|
| `bin/source_monitor` not found | Run `bundle install`, ensure gem is loaded |
| Migrations fail | Check for duplicate Solid Queue migrations, remove dupes |
| Dashboard 404 | Verify mount in `config/routes.rb`, restart server |
| Jobs not processing | Start Solid Queue workers, check queue names match config |
| Action Cable errors | Verify Solid Cable or Redis is configured in `cable.yml` |
| Auth redirect loop | Check `authenticate_with` matches your auth system |

See `docs/troubleshooting.md` for comprehensive fixes.
data/.claude/skills/sm-job/SKILL.md
@@ -0,0 +1,263 @@

---
name: sm-job
description: Solid Queue job conventions for the SourceMonitor engine. Use when creating new background jobs, modifying existing jobs, configuring queues, or working with job scheduling and retry policies.
allowed-tools: Read, Write, Edit, Bash, Glob, Grep
---

# SourceMonitor Job Development

## Overview

SourceMonitor uses Solid Queue (the Rails 8 default) for background processing. All jobs inherit from `SourceMonitor::ApplicationJob` and use engine-namespaced queues.

## Queue Architecture

| Queue Role | Default Name | Jobs |
|------------|-------------|------|
| `:fetch` | `source_monitor_fetch` | FetchFeedJob, ScheduleFetchesJob, ItemCleanupJob, LogCleanupJob, SourceHealthCheckJob, ImportOpmlJob, ImportSessionHealthCheckJob |
| `:scrape` | `source_monitor_scrape` | ScrapeItemJob |

Queue names respect the host app's `ActiveJob::Base.queue_name_prefix` and `queue_name_delimiter`.
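As an illustrative plain-Ruby sketch (not the engine's actual implementation), the prefix behavior amounts to joining the host prefix onto the base queue name with the configured delimiter:

```ruby
# Sketch of prefix-aware queue naming; mirrors how Active Job composes
# queue_name_prefix + queue_name_delimiter + base name.
def resolved_queue_name(base, prefix: nil, delimiter: "_")
  return base if prefix.nil? || prefix.empty?

  [prefix, base].join(delimiter)
end

resolved_queue_name("source_monitor_fetch")
# => "source_monitor_fetch"
resolved_queue_name("source_monitor_fetch", prefix: "myapp")
# => "myapp_source_monitor_fetch"
resolved_queue_name("source_monitor_fetch", prefix: "myapp", delimiter: ".")
# => "myapp.source_monitor_fetch"
```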
## Existing Jobs

| Job | Queue | Purpose | Pattern |
|-----|-------|---------|---------|
| `FetchFeedJob` | `:fetch` | Fetches a single source's feed | Delegates to `FetchRunner` |
| `ScheduleFetchesJob` | `:fetch` | Batch-enqueues due fetches | Delegates to `Scheduler.run` |
| `ScrapeItemJob` | `:scrape` | Scrapes a single item's URL | Delegates to `Scraping::ItemScraper` |
| `ItemCleanupJob` | `:fetch` | Prunes items by retention policy | Delegates to `RetentionPruner` |
| `LogCleanupJob` | `:fetch` | Removes old fetch/scrape logs | Direct SQL batches |
| `SourceHealthCheckJob` | `:fetch` | Runs health check on a source | Delegates to `Health::SourceHealthCheck` |
| `ImportOpmlJob` | `:fetch` | Imports sources from OPML | Delegates to source creation |
| `ImportSessionHealthCheckJob` | `:fetch` | Health-checks import candidates | Delegates to `Health::ImportSourceHealthCheck` |

## Key Conventions

### 1. Shallow Jobs

Jobs contain **only** deserialization + delegation. No business logic lives in job classes.

```ruby
# CORRECT -- shallow delegation
def perform(source_id)
  source = SourceMonitor::Source.find_by(id: source_id)
  return unless source

  SourceMonitor::Fetching::FetchRunner.new(source: source).run
end

# WRONG -- business logic in job
def perform(source_id)
  source = SourceMonitor::Source.find(source_id)
  response = Faraday.get(source.feed_url) # Don't do this
  feed = Feedjira.parse(response.body)    # Business logic belongs elsewhere
end
```

### 2. Queue Declaration

Use the `source_monitor_queue` class method (not `queue_as`):

```ruby
class MyJob < SourceMonitor::ApplicationJob
  source_monitor_queue :fetch # Uses SourceMonitor.queue_name(:fetch)
end
```

This ensures the queue name respects engine configuration and host app prefixes.

### 3. ID-Based Arguments

Pass record IDs, not Active Record objects. Guard against missing records:

```ruby
def perform(source_id)
  source = SourceMonitor::Source.find_by(id: source_id)
  return unless source # Silently skip if deleted

  # ...
end
```

### 4. Error Handling

Use ActiveJob's built-in error handling:

```ruby
discard_on ActiveJob::DeserializationError # Record deleted between enqueue and perform
retry_on SomeTransientError, wait: 30.seconds, attempts: 5
```

### 5. Logging Pattern

Use structured logging with a consistent format:

```ruby
def log(stage, **extra)
  return unless defined?(Rails) && Rails.respond_to?(:logger) && Rails.logger

  payload = { stage: "SourceMonitor::MyJob##{stage}", **extra }.compact
  Rails.logger.info("[SourceMonitor::MyJob] #{payload.to_json}")
rescue StandardError
  nil
end
```

## Creating a New Job

### Template

```ruby
# app/jobs/source_monitor/my_new_job.rb
# frozen_string_literal: true

module SourceMonitor
  class MyNewJob < ApplicationJob
    source_monitor_queue :fetch # or :scrape

    discard_on ActiveJob::DeserializationError

    def perform(record_id)
      record = SourceMonitor::Source.find_by(id: record_id)
      return unless record

      # Delegate to a service/model method
      SourceMonitor::MyService.new(record: record).call
    end
  end
end
```

### Steps

1. Create file at `app/jobs/source_monitor/my_new_job.rb`
2. Inherit from `SourceMonitor::ApplicationJob`
3. Call `source_monitor_queue` with `:fetch` or `:scrape`
4. Add `discard_on ActiveJob::DeserializationError`
5. Accept IDs as arguments, guard with `find_by`
6. Delegate to a service/model -- no business logic in the job
7. Write tests in `test/jobs/source_monitor/my_new_job_test.rb`

## Queue Configuration

### Engine Configuration

```ruby
SourceMonitor.configure do |config|
  config.queue_namespace = "source_monitor"          # Base namespace
  config.fetch_queue_name = "source_monitor_fetch"   # Fetch queue name
  config.scrape_queue_name = "source_monitor_scrape" # Scrape queue name
  config.fetch_queue_concurrency = 2                 # Concurrent fetch workers
  config.scrape_queue_concurrency = 2                # Concurrent scrape workers
end
```

### Queue Name Resolution

```ruby
SourceMonitor.queue_name(:fetch) # => "source_monitor_fetch"
# With host app prefix "myapp": => "myapp_source_monitor_fetch"
```

### Recurring Jobs

`ScheduleFetchesJob` is typically configured as a recurring job in `config/recurring.yml`:

```yaml
production:
  schedule_fetches:
    class: SourceMonitor::ScheduleFetchesJob
    schedule: every 1 minute
    queue: source_monitor_fetch
```

## Retry Policies

`FetchFeedJob` uses a custom retry strategy via `RetryPolicy`:

| Error Type | Retry Attempts | Wait | Circuit Breaker |
|------------|---------------|------|-----------------|
| Timeout | 2 | 2 min | 1 hour |
| Connection | 3 | 5 min | 1 hour |
| HTTP 429 | 2 | 15 min | 90 min |
| HTTP 5xx | 2 | 10 min | 90 min |
| HTTP 4xx | 1 | 45 min | 2 hours |
| Parsing | 1 | 30 min | 2 hours |
| Unexpected | 1 | 30 min | 2 hours |
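The table reads as a lookup from error category to policy. A hypothetical data-driven rendering in plain Ruby (the category names and hash structure are illustrative, not the engine's actual `RetryPolicy` API):

```ruby
# Illustrative encoding of the retry table above; not the engine's real code.
RETRY_POLICIES = {
  timeout:    { attempts: 2, wait_minutes: 2,  circuit_break_minutes: 60 },
  connection: { attempts: 3, wait_minutes: 5,  circuit_break_minutes: 60 },
  http_429:   { attempts: 2, wait_minutes: 15, circuit_break_minutes: 90 },
  http_5xx:   { attempts: 2, wait_minutes: 10, circuit_break_minutes: 90 },
  http_4xx:   { attempts: 1, wait_minutes: 45, circuit_break_minutes: 120 },
  parsing:    { attempts: 1, wait_minutes: 30, circuit_break_minutes: 120 },
  unexpected: { attempts: 1, wait_minutes: 30, circuit_break_minutes: 120 }
}.freeze

# Unknown categories fall back to the most conservative policy.
def retry_policy_for(category)
  RETRY_POLICIES.fetch(category, RETRY_POLICIES[:unexpected])
end

retry_policy_for(:http_429)
# => {attempts: 2, wait_minutes: 15, circuit_break_minutes: 90}
```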
## CleanupOptions Helper

`SourceMonitor::Jobs::CleanupOptions` normalizes job arguments for cleanup jobs:

```ruby
options = CleanupOptions.normalize(options)                   # Symbolize keys, handle nil
now = CleanupOptions.resolve_time(options[:now])              # Parse time strings
ids = CleanupOptions.extract_ids(options[:source_ids])        # Flatten/parse IDs
batch_size = CleanupOptions.batch_size(options, default: 100) # Safe integer
```

## Testing

### Test Template

```ruby
# test/jobs/source_monitor/my_new_job_test.rb
# frozen_string_literal: true

require "test_helper"

module SourceMonitor
  class MyNewJobTest < ActiveJob::TestCase
    setup do
      @source = create_source!
    end

    test "performs work for valid source" do
      # Stub external calls
      MyService.any_instance.expects(:call).once

      MyNewJob.perform_now(@source.id)
    end

    test "silently skips missing source" do
      assert_nothing_raised do
        MyNewJob.perform_now(-1)
      end
    end

    test "enqueues on correct queue" do
      assert_enqueued_with(job: MyNewJob, queue: SourceMonitor.queue_name(:fetch).to_s) do
        MyNewJob.perform_later(@source.id)
      end
    end
  end
end
```

### Testing Enqueue from Models

```ruby
test "fetching enqueues via FetchRunner.enqueue" do
  with_inline_jobs do
    stub_request(:get, source.feed_url).to_return(status: 200, body: feed_xml)
    SourceMonitor::Fetching::FetchRunner.enqueue(source)
  end
end
```

## Checklist

- [ ] Job inherits from `SourceMonitor::ApplicationJob`
- [ ] Uses `source_monitor_queue` (not `queue_as`)
- [ ] Accepts IDs, not AR objects
- [ ] Guards with `find_by` + early return
- [ ] No business logic in the job class
- [ ] `discard_on ActiveJob::DeserializationError`
- [ ] Error handling with `retry_on` where appropriate
- [ ] Test covers perform, missing record, and queue assignment
- [ ] All tests GREEN

## References

- `app/jobs/source_monitor/` -- All engine jobs
- `lib/source_monitor/jobs/` -- Job support classes (CleanupOptions, Visibility, SolidQueueMetrics)
- `lib/source_monitor/configuration.rb` -- Queue configuration
- `test/jobs/source_monitor/` -- Job tests
data/.claude/skills/sm-job/reference/job-conventions.md
@@ -0,0 +1,245 @@

# Job Conventions Reference

## ApplicationJob Base Class

All engine jobs inherit from `SourceMonitor::ApplicationJob`:

```ruby
# app/jobs/source_monitor/application_job.rb
module SourceMonitor
  parent_job = defined?(::ApplicationJob) ? ::ApplicationJob : ActiveJob::Base

  class ApplicationJob < parent_job
    class << self
      def source_monitor_queue(role)
        queue_as SourceMonitor.queue_name(role)
      end
    end
  end
end
```

Key behaviors:

- Inherits from host app's `ApplicationJob` if available, otherwise `ActiveJob::Base`
- Provides `source_monitor_queue` class method for engine-aware queue naming
- Host app job middleware (logging, error tracking) applies automatically

## Queue Naming

### Configuration Chain

```
SourceMonitor.queue_name(:fetch)
  -> config.queue_name_for(:fetch)
  -> config.fetch_queue_name # "source_monitor_fetch"
  -> prepend ActiveJob::Base.queue_name_prefix if set
```

### Default Names

| Role | Queue Name |
|------|-----------|
| `:fetch` | `source_monitor_fetch` |
| `:scrape` | `source_monitor_scrape` |

### With Host App Prefix

If the host app sets `ActiveJob::Base.queue_name_prefix = "myapp"`:

- Fetch queue becomes `myapp_source_monitor_fetch`
- Scrape queue becomes `myapp_source_monitor_scrape`

## Job Patterns by Type

### Fetch Job (FetchFeedJob)

The most complex job, demonstrating retry strategy integration:

```ruby
class FetchFeedJob < ApplicationJob
  FETCH_CONCURRENCY_RETRY_WAIT = 30.seconds
  EARLY_EXECUTION_LEEWAY = 30.seconds

  source_monitor_queue :fetch

  discard_on ActiveJob::DeserializationError
  retry_on FetchRunner::ConcurrencyError, wait: 30.seconds, attempts: 5

  def perform(source_id, force: false)
    source = Source.find_by(id: source_id)
    return unless source
    return unless should_run?(source, force: force)

    FetchRunner.new(source: source, force: force).run
  rescue FetchError => error
    handle_transient_error(source, error)
  end
end
```

Notable patterns:

- `should_run?` guard prevents premature execution
- `ConcurrencyError` uses ActiveJob `retry_on` (another worker holds the lock)
- `FetchError` uses custom retry logic via `RetryPolicy`
- `force: false` keyword argument for manual vs scheduled fetches

### Cleanup Job (ItemCleanupJob)

Demonstrates the options normalization pattern:

```ruby
class ItemCleanupJob < ApplicationJob
  DEFAULT_BATCH_SIZE = 100

  source_monitor_queue :fetch

  def perform(options = nil)
    options = Jobs::CleanupOptions.normalize(options)
    scope = resolve_scope(options)
    batch_size = Jobs::CleanupOptions.batch_size(options, default: DEFAULT_BATCH_SIZE)
    now = Jobs::CleanupOptions.resolve_time(options[:now])
    strategy = resolve_strategy(options)

    scope.find_in_batches(batch_size:) do |batch|
      batch.each { |source| RetentionPruner.call(source:, now:, strategy:) }
    end
  end
end
```

Notable patterns:

- Accepts flexible `options` hash (works with both manual and recurring invocation)
- Uses `CleanupOptions` helper for safe normalization
- Batched processing with configurable batch size

### Worker Job (ScrapeItemJob)

Demonstrates lifecycle logging:

```ruby
class ScrapeItemJob < ApplicationJob
  source_monitor_queue :scrape

  discard_on ActiveJob::DeserializationError

  def perform(item_id)
    log("job:start", item_id: item_id)
    item = Item.includes(:source).find_by(id: item_id)
    return unless item

    source = item.source
    unless source&.scraping_enabled?
      log("job:skipped_scraping_disabled", item: item)
      Scraping::State.clear_inflight!(item)
      return
    end

    Scraping::State.mark_processing!(item)
    Scraping::ItemScraper.new(item:, source:).call
    log("job:completed", item: item, status: item.scrape_status)
  rescue StandardError => error
    log("job:error", item: item, error: error.message)
    Scraping::State.mark_failed!(item)
    raise
  ensure
    Scraping::State.clear_inflight!(item) if item
  end
end
```

Notable patterns:

- `includes(:source)` prevents N+1 query
- Lifecycle state management (`mark_processing!`, `clear_inflight!`)
- Error re-raise after state cleanup
- Structured JSON logging at each stage

### Scheduling Job (ScheduleFetchesJob)

Simplest pattern -- pure delegation:

```ruby
class ScheduleFetchesJob < ApplicationJob
  source_monitor_queue :fetch

  def perform(options = nil)
    limit = extract_limit(options)
    Scheduler.run(limit:)
  end
end
```

### Broadcast Job (SourceHealthCheckJob)

Demonstrates result broadcasting:

```ruby
class SourceHealthCheckJob < ApplicationJob
  source_monitor_queue :fetch

  discard_on ActiveJob::DeserializationError

  def perform(source_id)
    source = Source.find_by(id: source_id)
    return unless source

    result = Health::SourceHealthCheck.new(source: source).call
    broadcast_outcome(source, result)
    result
  rescue StandardError => error
    record_unexpected_failure(source, error) if source
    broadcast_outcome(source, nil, error) if source
    nil
  end
end
```

Notable patterns:

- Always broadcasts UI update (success or failure)
- Creates log record even for unexpected failures
- Returns nil on error instead of re-raising (health checks are non-critical)

## _later / _now Naming Convention

Models and services should expose `_later` methods for async work:

```ruby
# On the model or service
def self.fetch_later(source_or_id, force: false)
  FetchRunner.enqueue(source_or_id, force: force)
end

def self.fetch_now(source, force: false)
  FetchRunner.run(source: source, force: force)
end
```

Jobs are the mechanism, not the API. Callers should use model/service methods, not enqueue jobs directly.

## Job Support Classes

### CleanupOptions

**File:** `lib/source_monitor/jobs/cleanup_options.rb`

Normalizes job arguments for cleanup jobs:

| Method | Purpose |
|--------|---------|
| `normalize(options)` | Symbolize keys, handle nil/non-Hash |
| `resolve_time(value)` | Parse Time/String/nil to Time |
| `extract_ids(value)` | Flatten arrays, split CSV, convert to integers |
| `integer(value)` | Safe Integer conversion |
| `batch_size(options, default:)` | Extract positive batch size |
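The normalization behavior in the table can be sketched in plain Ruby. This is an illustrative re-implementation of two of the helpers, not the engine's actual code (see `cleanup_options.rb` for that):

```ruby
# Hypothetical sketch of normalize and extract_ids, per the table above.
def normalize(options)
  return {} unless options.is_a?(Hash)

  # Symbolize keys; non-Hash and nil inputs collapse to an empty hash.
  options.each_with_object({}) { |(key, value), hash| hash[key.to_sym] = value }
end

def extract_ids(value)
  # Wrap scalars, flatten CSV strings, and keep only valid integers.
  Array(value).flat_map { |v| v.to_s.split(",") }
              .filter_map { |v| Integer(v, exception: false) }
end

normalize(nil)                # => {}
normalize("batch_size" => 50) # => {batch_size: 50}
extract_ids(["1,2", 3, nil])  # => [1, 2, 3]
```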
### FetchFailureSubscriber

**File:** `lib/source_monitor/jobs/fetch_failure_subscriber.rb`

Subscribes to Solid Queue failure events for fetch queue jobs. Used for metrics and alerting.

### Visibility

**File:** `lib/source_monitor/jobs/visibility.rb`

Tracks queue depth and timing metrics per queue.

### SolidQueueMetrics

**File:** `lib/source_monitor/jobs/solid_queue_metrics.rb`

Queries Solid Queue tables for dashboard metrics: pending count, failed count, paused queues, oldest job age.