source_monitor 0.3.0 → 0.3.2
- checksums.yaml +4 -4
- data/.claude/skills/sm-architecture/SKILL.md +233 -0
- data/.claude/skills/sm-architecture/reference/extraction-patterns.md +192 -0
- data/.claude/skills/sm-architecture/reference/module-map.md +194 -0
- data/.claude/skills/sm-configuration-setting/SKILL.md +264 -0
- data/.claude/skills/sm-configuration-setting/reference/settings-catalog.md +248 -0
- data/.claude/skills/sm-configuration-setting/reference/settings-pattern.md +297 -0
- data/.claude/skills/sm-configure/SKILL.md +153 -0
- data/.claude/skills/sm-configure/reference/configuration-reference.md +321 -0
- data/.claude/skills/sm-dashboard-widget/SKILL.md +344 -0
- data/.claude/skills/sm-dashboard-widget/reference/dashboard-patterns.md +304 -0
- data/.claude/skills/sm-domain-model/SKILL.md +188 -0
- data/.claude/skills/sm-domain-model/reference/model-graph.md +114 -0
- data/.claude/skills/sm-domain-model/reference/table-structure.md +348 -0
- data/.claude/skills/sm-engine-migration/SKILL.md +395 -0
- data/.claude/skills/sm-engine-migration/reference/migration-conventions.md +255 -0
- data/.claude/skills/sm-engine-test/SKILL.md +302 -0
- data/.claude/skills/sm-engine-test/reference/test-helpers.md +259 -0
- data/.claude/skills/sm-engine-test/reference/test-patterns.md +411 -0
- data/.claude/skills/sm-event-handler/SKILL.md +265 -0
- data/.claude/skills/sm-event-handler/reference/events-api.md +229 -0
- data/.claude/skills/sm-health-rule/SKILL.md +327 -0
- data/.claude/skills/sm-health-rule/reference/health-system.md +269 -0
- data/.claude/skills/sm-host-setup/SKILL.md +223 -0
- data/.claude/skills/sm-host-setup/reference/initializer-template.md +195 -0
- data/.claude/skills/sm-host-setup/reference/setup-checklist.md +134 -0
- data/.claude/skills/sm-job/SKILL.md +263 -0
- data/.claude/skills/sm-job/reference/job-conventions.md +245 -0
- data/.claude/skills/sm-model-extension/SKILL.md +287 -0
- data/.claude/skills/sm-model-extension/reference/extension-api.md +317 -0
- data/.claude/skills/sm-pipeline-stage/SKILL.md +254 -0
- data/.claude/skills/sm-pipeline-stage/reference/completion-handlers.md +152 -0
- data/.claude/skills/sm-pipeline-stage/reference/entry-processing.md +191 -0
- data/.claude/skills/sm-pipeline-stage/reference/feed-fetcher-architecture.md +198 -0
- data/.claude/skills/sm-scraper-adapter/SKILL.md +284 -0
- data/.claude/skills/sm-scraper-adapter/reference/adapter-contract.md +167 -0
- data/.claude/skills/sm-scraper-adapter/reference/example-adapter.md +274 -0
- data/.vbw-planning/.notification-log.jsonl +102 -0
- data/.vbw-planning/.session-log.jsonl +505 -0
- data/AGENTS.md +20 -57
- data/CHANGELOG.md +19 -0
- data/CLAUDE.md +44 -1
- data/CONTRIBUTING.md +5 -5
- data/Gemfile.lock +20 -21
- data/README.md +18 -5
- data/VERSION +1 -0
- data/docs/deployment.md +1 -1
- data/docs/setup.md +4 -4
- data/lib/source_monitor/setup/skills_installer.rb +94 -0
- data/lib/source_monitor/setup/workflow.rb +17 -2
- data/lib/source_monitor/version.rb +1 -1
- data/lib/tasks/source_monitor_setup.rake +58 -0
- data/source_monitor.gemspec +1 -0
- metadata +39 -1
|
@@ -0,0 +1,229 @@
|
|
|
1
|
+
# Events API Reference
|
|
2
|
+
|
|
3
|
+
Complete reference for SourceMonitor's event system.
|
|
4
|
+
|
|
5
|
+
Source: `lib/source_monitor/events.rb` and `lib/source_monitor/configuration/events.rb`
|
|
6
|
+
|
|
7
|
+
## Registration API
|
|
8
|
+
|
|
9
|
+
All registration happens on `config.events` inside the configure block:
|
|
10
|
+
|
|
11
|
+
```ruby
|
|
12
|
+
SourceMonitor.configure do |config|
|
|
13
|
+
config.events.after_item_created { |event| ... }
|
|
14
|
+
config.events.after_item_scraped(handler)
|
|
15
|
+
config.events.after_fetch_completed(MyHandler.new)
|
|
16
|
+
config.events.register_item_processor(->(ctx) { ... })
|
|
17
|
+
end
|
|
18
|
+
```
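
As a rough illustration of the `MyHandler.new` form above, a handler object only needs a `#call` method. The sketch below uses a hypothetical `AuditHandler` class and a stand-in `FakeEvent` struct (neither is part of SourceMonitor) so it runs without the gem loaded:

```ruby
# Hypothetical stand-in for an event; the real structs are listed under "Event Structs".
FakeEvent = Struct.new(:status, :occurred_at, keyword_init: true)

# Hypothetical handler object: anything that responds to #call can be registered.
class AuditHandler
  attr_reader :seen

  def initialize
    @seen = []
  end

  # Receives the event and records its status as a string.
  def call(event)
    @seen << event.status.to_s
  end
end

handler = AuditHandler.new
handler.call(FakeEvent.new(status: "fetched", occurred_at: Time.now))
```

In a host app, an instance like this would be passed to one of the registration methods, for example `config.events.after_fetch_completed(AuditHandler.new)`.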
|
|
19
|
+
|
|
20
|
+
### `after_item_created(handler = nil, &block)`
|
|
21
|
+
|
|
22
|
+
Register a callback for new item creation.
|
|
23
|
+
|
|
24
|
+
**Handler requirements:** Must respond to `#call`. Receives an `ItemCreatedEvent`.
|
|
25
|
+
|
|
26
|
+
**Returns:** The registered callable.
|
|
27
|
+
|
|
28
|
+
### `after_item_scraped(handler = nil, &block)`
|
|
29
|
+
|
|
30
|
+
Register a callback for item scrape completion.
|
|
31
|
+
|
|
32
|
+
**Handler requirements:** Must respond to `#call`. Receives an `ItemScrapedEvent`.
|
|
33
|
+
|
|
34
|
+
**Returns:** The registered callable.
|
|
35
|
+
|
|
36
|
+
### `after_fetch_completed(handler = nil, &block)`
|
|
37
|
+
|
|
38
|
+
Register a callback for feed fetch completion.
|
|
39
|
+
|
|
40
|
+
**Handler requirements:** Must respond to `#call`. Receives a `FetchCompletedEvent`.
|
|
41
|
+
|
|
42
|
+
**Returns:** The registered callable.
|
|
43
|
+
|
|
44
|
+
### `register_item_processor(processor = nil, &block)`
|
|
45
|
+
|
|
46
|
+
Register an item processor for post-entry processing.
|
|
47
|
+
|
|
48
|
+
**Handler requirements:** Must respond to `#call`. Receives an `ItemProcessorContext`.
|
|
49
|
+
|
|
50
|
+
**Returns:** The registered callable.
|
|
51
|
+
|
|
52
|
+
### `callbacks_for(name) -> Array`
|
|
53
|
+
|
|
54
|
+
Retrieve a copy of registered callbacks for a given event name.
|
|
55
|
+
|
|
56
|
+
### `item_processors -> Array`
|
|
57
|
+
|
|
58
|
+
Retrieve a copy of registered item processors.
|
|
59
|
+
|
|
60
|
+
### `reset!`
|
|
61
|
+
|
|
62
|
+
Clear all callbacks and item processors. Used in tests.
|
|
63
|
+
|
|
64
|
+
## Event Structs
|
|
65
|
+
|
|
66
|
+
### `ItemCreatedEvent`
|
|
67
|
+
|
|
68
|
+
Fired by `Events.after_item_created` after a new item is created from a feed entry.
|
|
69
|
+
|
|
70
|
+
```ruby
|
|
71
|
+
ItemCreatedEvent = Struct.new(
|
|
72
|
+
:item, # SourceMonitor::Item - the newly created item
|
|
73
|
+
:source, # SourceMonitor::Source - the owning source
|
|
74
|
+
:entry, # Object - raw feed entry from Feedjira
|
|
75
|
+
:result, # Object - creation result
|
|
76
|
+
:status, # String - result status (e.g., "created")
|
|
77
|
+
:occurred_at, # Time - when the event fired
|
|
78
|
+
keyword_init: true
|
|
79
|
+
)
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
**Helper methods:**
|
|
83
|
+
- `created?` -- returns `true` when `status.to_s == "created"`
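
A minimal sketch of how such a helper could sit on a keyword-initialized struct, assuming the helper is defined in the struct's block. The name `SketchItemCreatedEvent` is hypothetical, chosen to avoid clashing with the real constant. Because the comparison uses `status.to_s`, a symbol status also matches:

```ruby
# Sketch only: mirrors the documented fields; the real definition lives in the gem.
SketchItemCreatedEvent = Struct.new(
  :item, :source, :entry, :result, :status, :occurred_at,
  keyword_init: true
) do
  # True when the status, coerced to a string, equals "created".
  def created?
    status.to_s == "created"
  end
end

created = SketchItemCreatedEvent.new(status: :created, occurred_at: Time.now)
updated = SketchItemCreatedEvent.new(status: "updated")
```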
|
|
84
|
+
|
|
85
|
+
**Dispatched from:** `SourceMonitor::Events.after_item_created` (called by `EntryProcessor`)
|
|
86
|
+
|
|
87
|
+
### `ItemScrapedEvent`
|
|
88
|
+
|
|
89
|
+
Fired by `Events.after_item_scraped` after content scraping completes.
|
|
90
|
+
|
|
91
|
+
```ruby
|
|
92
|
+
ItemScrapedEvent = Struct.new(
|
|
93
|
+
:item, # SourceMonitor::Item - the scraped item
|
|
94
|
+
:source, # SourceMonitor::Source - the owning source
|
|
95
|
+
:result, # Object - scrape result
|
|
96
|
+
:log, # SourceMonitor::ScrapeLog - the scrape log record
|
|
97
|
+
:status, # String - result status
|
|
98
|
+
:occurred_at, # Time - when the event fired
|
|
99
|
+
keyword_init: true
|
|
100
|
+
)
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
**Helper methods:**
|
|
104
|
+
- `success?` -- returns `true` when `status.to_s != "failed"`
|
|
105
|
+
|
|
106
|
+
**Dispatched from:** `SourceMonitor::Events.after_item_scraped` (called by `ItemScraper`)
|
|
107
|
+
|
|
108
|
+
### `FetchCompletedEvent`
|
|
109
|
+
|
|
110
|
+
Fired by `Events.after_fetch_completed` after a feed fetch finishes.
|
|
111
|
+
|
|
112
|
+
```ruby
|
|
113
|
+
FetchCompletedEvent = Struct.new(
|
|
114
|
+
:source, # SourceMonitor::Source - the fetched source
|
|
115
|
+
:result, # Object - fetch result
|
|
116
|
+
:status, # String - result status
|
|
117
|
+
:occurred_at, # Time - when the event fired
|
|
118
|
+
keyword_init: true
|
|
119
|
+
)
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
**Dispatched from:** `SourceMonitor::Events.after_fetch_completed` (called by `Completion::EventPublisher`)
|
|
123
|
+
|
|
124
|
+
### `ItemProcessorContext`
|
|
125
|
+
|
|
126
|
+
Passed to item processors registered via `register_item_processor`.
|
|
127
|
+
|
|
128
|
+
```ruby
|
|
129
|
+
ItemProcessorContext = Struct.new(
|
|
130
|
+
:item, # SourceMonitor::Item - the processed item
|
|
131
|
+
:source, # SourceMonitor::Source - the owning source
|
|
132
|
+
:entry, # Object - raw feed entry
|
|
133
|
+
:result, # Object - processing result
|
|
134
|
+
:status, # String - result status
|
|
135
|
+
:occurred_at, # Time - when processing occurred
|
|
136
|
+
keyword_init: true
|
|
137
|
+
)
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
**Dispatched from:** `SourceMonitor::Events.run_item_processors` (called by `EntryProcessor`)
|
|
141
|
+
|
|
142
|
+
## Dispatch Mechanics
|
|
143
|
+
|
|
144
|
+
### `Events.dispatch(event_name, event)`
|
|
145
|
+
|
|
146
|
+
Iterates all callbacks for the event name and calls each one:
|
|
147
|
+
|
|
148
|
+
```ruby
|
|
149
|
+
def dispatch(event_name, event)
|
|
150
|
+
SourceMonitor.config.events.callbacks_for(event_name).each do |callback|
|
|
151
|
+
invoke(callback, event)
|
|
152
|
+
rescue StandardError => error
|
|
153
|
+
log_handler_error(event_name, callback, error)
|
|
154
|
+
end
|
|
155
|
+
end
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
### `Events.invoke(callable, event)`
|
|
159
|
+
|
|
160
|
+
Handles zero-arity and single-arity callables:
|
|
161
|
+
|
|
162
|
+
```ruby
|
|
163
|
+
def invoke(callable, event)
|
|
164
|
+
if callable.respond_to?(:arity) && callable.arity.zero?
|
|
165
|
+
callable.call
|
|
166
|
+
else
|
|
167
|
+
callable.call(event)
|
|
168
|
+
end
|
|
169
|
+
end
|
|
170
|
+
```
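
To make the arity branching concrete, the sketch below repeats the documented `invoke` logic so it runs standalone and feeds it three kinds of callables. `EchoHandler` is a hypothetical class; plain objects do not respond to `#arity`, so they always receive the event:

```ruby
# Same branching as the documented invoke, reproduced so the example is standalone.
def invoke(callable, event)
  if callable.respond_to?(:arity) && callable.arity.zero?
    callable.call
  else
    callable.call(event)
  end
end

# Hypothetical handler object; it has #call but no #arity.
class EchoHandler
  def call(event)
    "object:#{event}"
  end
end

zero_arity = -> { "no-args" }
one_arity  = ->(event) { "lambda:#{event}" }

results = [
  invoke(zero_arity, :evt),      # arity 0, called without the event
  invoke(one_arity, :evt),       # arity 1, called with the event
  invoke(EchoHandler.new, :evt)  # no #arity, falls through to call(event)
]
```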
|
|
171
|
+
|
|
172
|
+
### Error Logging
|
|
173
|
+
|
|
174
|
+
Handler errors are logged but never propagated:
|
|
175
|
+
|
|
176
|
+
```
|
|
177
|
+
[SourceMonitor] after_item_created handler #<Proc:0x...> failed: RuntimeError: boom
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
Errors are logged via `Rails.logger.error`, with `warn` as the fallback.
|
|
181
|
+
|
|
182
|
+
## Handler Types
|
|
183
|
+
|
|
184
|
+
| Type | Example | Notes |
|
|
185
|
+
|---|---|---|
|
|
186
|
+
| Block | `after_item_created { \|e\| ... }` | Most common for simple handlers |
|
|
187
|
+
| Lambda | `after_item_created ->(e) { ... }` | Strict arity checking |
|
|
188
|
+
| Proc | `after_item_created proc { \|e\| ... }` | Relaxed arity |
|
|
189
|
+
| Object | `after_item_created(MyHandler.new)` | Must define `#call` |
|
|
190
|
+
| Zero-arity | `after_item_created -> { ... }` | Called without event argument |
|
|
191
|
+
|
|
192
|
+
## Multiple Handlers
|
|
193
|
+
|
|
194
|
+
Multiple handlers can be registered for the same event. They execute in registration order:
|
|
195
|
+
|
|
196
|
+
```ruby
|
|
197
|
+
config.events.after_item_created { |e| log(e) } # runs first
|
|
198
|
+
config.events.after_item_created { |e| notify(e) } # runs second
|
|
199
|
+
config.events.after_item_created { |e| index(e) } # runs third
|
|
200
|
+
```
|
|
201
|
+
|
|
202
|
+
If handler 2 raises, handlers 1 and 3 still execute, because each handler's error is rescued and logged individually.
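
The isolation behavior can be reproduced in a few lines of plain Ruby. This is a hypothetical miniature loop rather than the gem's dispatcher, and it collects errors in an array instead of logging them:

```ruby
# Hypothetical miniature dispatcher, not the gem's implementation.
calls = []
errors = []

handlers = [
  ->(e) { calls << "log:#{e}" },
  ->(_e) { raise "boom" },
  ->(e) { calls << "index:#{e}" }
]

handlers.each do |handler|
  handler.call(:item_created)
rescue StandardError => error
  # The failure is recorded and the loop continues with the next handler.
  errors << "#{error.class}: #{error.message}"
end
```

The `rescue` inside the `do ... end` block applies per iteration, which is why the third handler still runs.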
|
|
203
|
+
|
|
204
|
+
## Callback Keys
|
|
205
|
+
|
|
206
|
+
The `CALLBACK_KEYS` constant defines valid event names:
|
|
207
|
+
|
|
208
|
+
```ruby
|
|
209
|
+
CALLBACK_KEYS = %i[after_item_created after_item_scraped after_fetch_completed].freeze
|
|
210
|
+
```
|
|
211
|
+
|
|
212
|
+
Registering an unknown event raises `ArgumentError`.
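
A sketch of how key validation and copy-on-read could fit together, assuming a simple hash-backed registry. `SketchRegistry` and its `register` method are hypothetical; only `CALLBACK_KEYS`, `callbacks_for`, and the `ArgumentError` behavior come from the reference above:

```ruby
# Hypothetical registry illustrating key validation; not the gem's class.
class SketchRegistry
  CALLBACK_KEYS = %i[after_item_created after_item_scraped after_fetch_completed].freeze

  def initialize
    @callbacks = {}
  end

  # Accepts either a callable object or a block and returns the registered callable.
  def register(name, handler = nil, &block)
    key = name.to_sym
    raise ArgumentError, "unknown event: #{name}" unless CALLBACK_KEYS.include?(key)

    callable = handler || block
    raise ArgumentError, "handler must respond to #call" unless callable.respond_to?(:call)

    (@callbacks[key] ||= []) << callable
    callable
  end

  # Returns a copy so callers cannot mutate the registry's internal list.
  def callbacks_for(name)
    @callbacks.fetch(name.to_sym, []).dup
  end
end

registry = SketchRegistry.new
registered = registry.register(:after_item_created) { |event| event }
```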
|
|
213
|
+
|
|
214
|
+
## Pipeline Integration Points
|
|
215
|
+
|
|
216
|
+
| Event | Triggered By | File |
|
|
217
|
+
|---|---|---|
|
|
218
|
+
| `after_item_created` | `EntryProcessor` after creating item | `lib/source_monitor/fetching/feed_fetcher/entry_processor.rb` |
|
|
219
|
+
| `after_item_scraped` | `ItemScraper` after scraping | `lib/source_monitor/scraping/item_scraper.rb` |
|
|
220
|
+
| `after_fetch_completed` | `EventPublisher` after fetch | `lib/source_monitor/fetching/completion/event_publisher.rb` |
|
|
221
|
+
| Item processors | `EntryProcessor` after item created | `lib/source_monitor/fetching/feed_fetcher/entry_processor.rb` |
|
|
222
|
+
|
|
223
|
+
## Best Practices
|
|
224
|
+
|
|
225
|
+
1. **Keep handlers lightweight** -- heavy work should be enqueued as background jobs
|
|
226
|
+
2. **Handlers should be idempotent** -- they may be retried or run multiple times
|
|
227
|
+
3. **Never raise in handlers** -- errors are caught and logged rather than propagated, so a raise always signals a bug in the handler
|
|
228
|
+
4. **Use item processors for normalization** -- they run close to creation
|
|
229
|
+
5. **Use event callbacks for side effects** -- notifications, indexing, analytics
|
|
@@ -0,0 +1,327 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: sm-health-rule
|
|
3
|
+
description: Health status rules, circuit breaker, and auto-pause logic for SourceMonitor sources. Use when working with health checks, health status transitions, auto-pause thresholds, circuit breaker behavior, or adding new health rules.
|
|
4
|
+
allowed-tools: Read, Write, Edit, Bash, Glob, Grep
|
|
5
|
+
disable-model-invocation: true
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
# SourceMonitor Health Rule Development
|
|
9
|
+
|
|
10
|
+
## Overview
|
|
11
|
+
|
|
12
|
+
The health system monitors source reliability by tracking fetch success rates and automatically pausing unreliable sources. It consists of three main components:
|
|
13
|
+
|
|
14
|
+
1. **SourceHealthMonitor** -- evaluates rolling success rate, determines health status, triggers auto-pause/resume
|
|
15
|
+
2. **SourceHealthCheck** -- performs on-demand HTTP health checks
|
|
16
|
+
3. **SourceHealthReset** -- resets all health state for a source
|
|
17
|
+
|
|
18
|
+
## Architecture
|
|
19
|
+
|
|
20
|
+
```
|
|
21
|
+
Health Module (setup!)
|
|
22
|
+
|
|
|
23
|
+
+-- Registers callback: after_fetch_completed -> SourceHealthMonitor
|
|
24
|
+
|
|
|
25
|
+
+-- SourceHealthMonitor (per-fetch evaluation)
|
|
26
|
+
| +-- Reads recent fetch_logs
|
|
27
|
+
| +-- Calculates rolling_success_rate
|
|
28
|
+
| +-- Determines health_status
|
|
29
|
+
| +-- Triggers auto-pause / auto-resume
|
|
30
|
+
|
|
|
31
|
+
+-- SourceHealthCheck (on-demand)
|
|
32
|
+
| +-- HTTP GET to feed_url
|
|
33
|
+
| +-- Creates HealthCheckLog record
|
|
34
|
+
| +-- Returns Result struct
|
|
35
|
+
|
|
|
36
|
+
+-- SourceHealthReset (manual reset)
|
|
37
|
+
+-- Clears all health state
|
|
38
|
+
+-- Resets to "healthy" status
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
## Key Files
|
|
42
|
+
|
|
43
|
+
| File | Purpose | Lines |
|
|
44
|
+
|------|---------|-------|
|
|
45
|
+
| `lib/source_monitor/health.rb` | Module entry point, callback registration | 47 |
|
|
46
|
+
| `lib/source_monitor/health/source_health_monitor.rb` | Rolling success rate + status + auto-pause | 210 |
|
|
47
|
+
| `lib/source_monitor/health/source_health_check.rb` | On-demand HTTP health check | 100 |
|
|
48
|
+
| `lib/source_monitor/health/source_health_reset.rb` | Reset all health state | 68 |
|
|
49
|
+
| `lib/source_monitor/health/import_source_health_check.rb` | Health check for import candidates | 55 |
|
|
50
|
+
| `lib/source_monitor/configuration/health_settings.rb` | Configuration defaults | 27 |
|
|
51
|
+
| `app/models/source_monitor/health_check_log.rb` | Health check log record | 28 |
|
|
52
|
+
| `app/jobs/source_monitor/source_health_check_job.rb` | Background health check job | 77 |
|
|
53
|
+
|
|
54
|
+
## Health Status Values
|
|
55
|
+
|
|
56
|
+
| Status | Meaning | Trigger |
|
|
57
|
+
|--------|---------|---------|
|
|
58
|
+
| `healthy` | Source is reliable | success_rate >= healthy_threshold (0.8) |
|
|
59
|
+
| `warning` | Some failures occurring | success_rate >= warning_threshold (0.5) but < healthy |
|
|
60
|
+
| `critical` | High failure rate | success_rate < warning_threshold |
|
|
61
|
+
| `declining` | Consecutive failures | >= 3 consecutive failures in recent logs |
|
|
62
|
+
| `improving` | Recovery in progress | >= 2 consecutive successes after a failure |
|
|
63
|
+
| `auto_paused` | Automatically paused | success_rate < auto_pause_threshold (0.2) |
|
|
64
|
+
|
|
65
|
+
### Status Priority (highest to lowest)
|
|
66
|
+
|
|
67
|
+
```
|
|
68
|
+
auto_paused > declining > improving > healthy > warning > critical
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
Determination logic:
|
|
72
|
+
|
|
73
|
+
```ruby
|
|
74
|
+
def determine_status(rate, auto_paused_until, logs)
|
|
75
|
+
if auto_paused_active?(auto_paused_until)
|
|
76
|
+
"auto_paused"
|
|
77
|
+
elsif consecutive_failures(logs) >= 3
|
|
78
|
+
"declining"
|
|
79
|
+
elsif improving_streak?(logs)
|
|
80
|
+
"improving"
|
|
81
|
+
elsif rate >= healthy_threshold
|
|
82
|
+
"healthy"
|
|
83
|
+
elsif rate >= warning_threshold
|
|
84
|
+
"warning"
|
|
85
|
+
else
|
|
86
|
+
"critical"
|
|
87
|
+
end
|
|
88
|
+
end
|
|
89
|
+
```
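
To see the priority order in action, here is a standalone sketch of the same branching. It assumes `logs` is an array of booleans ordered newest first (`true` for a successful fetch) and replaces the `auto_paused_until` check with a plain boolean. The helper bodies are guesses based on the status table, not the gem's code:

```ruby
HEALTHY_THRESHOLD = 0.8
WARNING_THRESHOLD = 0.5

# Hypothetical: number of failures at the head of the newest-first log list.
def consecutive_failures(logs)
  logs.take_while { |ok| !ok }.size
end

# Hypothetical: at least two recent successes that follow an older failure.
def improving_streak?(logs)
  streak = logs.take_while { |ok| ok }.size
  streak >= 2 && streak < logs.size
end

def determine_status(rate, auto_paused, logs)
  if auto_paused
    "auto_paused"
  elsif consecutive_failures(logs) >= 3
    "declining"
  elsif improving_streak?(logs)
    "improving"
  elsif rate >= HEALTHY_THRESHOLD
    "healthy"
  elsif rate >= WARNING_THRESHOLD
    "warning"
  else
    "critical"
  end
end

declining = determine_status(0.9, false, [false, false, false, true])
improving = determine_status(0.9, false, [true, true, false])
healthy   = determine_status(0.9, false, [true, true, true])
```

Note that the streak checks win over the rate: the first call reports `declining` even though the rate alone would qualify as healthy.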
|
|
90
|
+
|
|
91
|
+
## Health Configuration
|
|
92
|
+
|
|
93
|
+
### Default Settings
|
|
94
|
+
|
|
95
|
+
| Setting | Default | Purpose |
|
|
96
|
+
|---------|---------|---------|
|
|
97
|
+
| `window_size` | 20 | Number of recent fetch logs to evaluate |
|
|
98
|
+
| `healthy_threshold` | 0.8 | Success rate for "healthy" status |
|
|
99
|
+
| `warning_threshold` | 0.5 | Success rate for "warning" status |
|
|
100
|
+
| `auto_pause_threshold` | 0.2 | Below this, source is auto-paused |
|
|
101
|
+
| `auto_resume_threshold` | 0.6 | Above this, auto-pause is lifted |
|
|
102
|
+
| `auto_pause_cooldown_minutes` | 60 | Minimum pause duration |
|
|
103
|
+
|
|
104
|
+
### Per-Source Override
|
|
105
|
+
|
|
106
|
+
Sources can override the global `auto_pause_threshold` by setting `health_auto_pause_threshold` (validated to the 0-1 range):
|
|
107
|
+
|
|
108
|
+
```ruby
|
|
109
|
+
source.health_auto_pause_threshold = 0.3 # Pauses sooner (less tolerant) than the default 0.2
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
### Configuration Access
|
|
113
|
+
|
|
114
|
+
```ruby
|
|
115
|
+
SourceMonitor.configure do |config|
|
|
116
|
+
config.health.window_size = 30
|
|
117
|
+
config.health.auto_pause_threshold = 0.15
|
|
118
|
+
config.health.auto_pause_cooldown_minutes = 120
|
|
119
|
+
end
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
## SourceHealthMonitor
|
|
123
|
+
|
|
124
|
+
### How It Works
|
|
125
|
+
|
|
126
|
+
The monitor runs automatically after every fetch via the `after_fetch_completed` event callback.
|
|
127
|
+
|
|
128
|
+
**Step 1: Gather Data**
|
|
129
|
+
```ruby
|
|
130
|
+
logs = source.fetch_logs.order(started_at: :desc).limit(window_size)
|
|
131
|
+
rate = successes.to_f / total
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
**Step 2: Check Thresholds**
|
|
135
|
+
|
|
136
|
+
Thresholds only apply when `logs.size >= window_size` (minimum sample size).
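
Steps 1 and 2 can be sketched without ActiveRecord, assuming the logs are reduced to an array of booleans ordered newest first. Both helper names are hypothetical, and returning `nil` for an empty window is a choice made for this sketch to avoid dividing by zero:

```ruby
# Hypothetical helper: logs is an array of booleans (true = successful fetch), newest first.
def rolling_success_rate(logs, window_size)
  window = logs.first(window_size)
  return nil if window.empty?

  window.count(true).to_f / window.size
end

# Thresholds are only meaningful once the window holds window_size entries.
def thresholds_apply?(logs, window_size)
  logs.first(window_size).size >= window_size
end

logs = [true, false, true, true]
rate = rolling_success_rate(logs, 4)
```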
|
|
137
|
+
|
|
138
|
+
**Step 3: Auto-Resume Check**
|
|
139
|
+
|
|
140
|
+
If source is currently auto-paused and success rate >= `auto_resume_threshold`:
|
|
141
|
+
- Clear `auto_paused_until` and `auto_paused_at`
|
|
142
|
+
- Clear `backoff_until`
|
|
143
|
+
|
|
144
|
+
**Step 4: Auto-Pause Check**
|
|
145
|
+
|
|
146
|
+
If success rate < `auto_pause_threshold`:
|
|
147
|
+
- Set `auto_paused_until` to `now + auto_pause_cooldown_minutes`
|
|
148
|
+
- Set `auto_paused_at` to now (or keep existing)
|
|
149
|
+
- Push `next_fetch_at` and `backoff_until` past the pause window
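
Steps 3 and 4 might look roughly like this in isolation. The sketch uses a hypothetical `SketchSource` struct and constants in place of the real model and configuration, treats a non-nil `auto_paused_until` as "currently auto-paused" (an assumption), and leaves out the `backoff_until` and `next_fetch_at` updates:

```ruby
# Hypothetical stand-in for the Source model; only the fields used here.
SketchSource = Struct.new(
  :health_auto_pause_threshold, :auto_paused_at, :auto_paused_until,
  keyword_init: true
)

GLOBAL_AUTO_PAUSE_THRESHOLD = 0.2
AUTO_RESUME_THRESHOLD = 0.6
COOLDOWN_MINUTES = 60

def apply_auto_pause(source, rate, now)
  # The per-source override wins over the global default when present.
  pause_threshold = source.health_auto_pause_threshold || GLOBAL_AUTO_PAUSE_THRESHOLD

  if source.auto_paused_until && rate >= AUTO_RESUME_THRESHOLD
    # Step 3: lift the pause.
    source.auto_paused_until = nil
    source.auto_paused_at = nil
  elsif rate < pause_threshold
    # Step 4: pause (or extend the pause), keeping an existing auto_paused_at.
    source.auto_paused_until = now + (COOLDOWN_MINUTES * 60)
    source.auto_paused_at ||= now
  end
  source
end

now = Time.at(0)
paused = apply_auto_pause(SketchSource.new, 0.1, now)
```

With a per-source threshold of 0.3, a rate of 0.25 triggers a pause that the default 0.2 would not, so a higher override pauses sooner.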
|
|
150
|
+
|
|
151
|
+
**Step 5: Fixed Interval Enforcement**
|
|
152
|
+
|
|
153
|
+
For non-adaptive sources that are not paused, clear `backoff_until` and reset `next_fetch_at` to the fixed interval.
|
|
154
|
+
|
|
155
|
+
**Step 6: Apply Status**
|
|
156
|
+
|
|
157
|
+
Only updates `health_status` and `health_status_changed_at` when the status actually changes.
|
|
158
|
+
|
|
159
|
+
### Source Fields Updated
|
|
160
|
+
|
|
161
|
+
| Field | Type | Purpose |
|
|
162
|
+
|-------|------|---------|
|
|
163
|
+
| `health_status` | string | Current health status |
|
|
164
|
+
| `health_status_changed_at` | datetime | When status last changed |
|
|
165
|
+
| `rolling_success_rate` | float | Current success rate (0.0-1.0) |
|
|
166
|
+
| `auto_paused_at` | datetime | When auto-pause was triggered |
|
|
167
|
+
| `auto_paused_until` | datetime | When auto-pause expires |
|
|
168
|
+
| `health_auto_pause_threshold` | float | Per-source override |
|
|
169
|
+
|
|
170
|
+
## Circuit Breaker (Fetch-Level)
|
|
171
|
+
|
|
172
|
+
Separate from health status, the fetch pipeline has its own circuit breaker via `RetryPolicy`:
|
|
173
|
+
|
|
174
|
+
| Field | Purpose |
|
|
175
|
+
|-------|---------|
|
|
176
|
+
| `fetch_retry_attempt` | Current retry count |
|
|
177
|
+
| `fetch_circuit_opened_at` | When circuit was opened |
|
|
178
|
+
| `fetch_circuit_until` | When circuit closes |
|
|
179
|
+
|
|
180
|
+
```ruby
|
|
181
|
+
def fetch_circuit_open?
|
|
182
|
+
fetch_circuit_until.present? && fetch_circuit_until.future?
|
|
183
|
+
end
|
|
184
|
+
```
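
`present?` and `future?` come from ActiveSupport. Outside Rails, the same check can be approximated in plain Ruby; the function name and the injectable `now` parameter below are hypothetical conveniences for testing:

```ruby
# Plain-Ruby approximation of the circuit check; `now` is injectable for tests.
def circuit_open?(circuit_until, now = Time.now)
  !circuit_until.nil? && circuit_until > now
end

now = Time.at(1_000)
open_circuit   = circuit_open?(Time.at(2_000), now)
closed_circuit = circuit_open?(Time.at(500), now)
never_opened   = circuit_open?(nil, now)
```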
|
|
185
|
+
|
|
186
|
+
The circuit breaker is managed by `RetryPolicy` in `SourceUpdater` and `FetchFeedJob`. It is distinct from the health system's auto-pause.
|
|
187
|
+
|
|
188
|
+
## SourceHealthCheck
|
|
189
|
+
|
|
190
|
+
On-demand HTTP check that creates a `HealthCheckLog`:
|
|
191
|
+
|
|
192
|
+
```ruby
|
|
193
|
+
result = SourceMonitor::Health::SourceHealthCheck.new(source: source).call
|
|
194
|
+
result.success? # => true/false
|
|
195
|
+
result.log # => HealthCheckLog record
|
|
196
|
+
result.error # => exception if failed
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
Features:
|
|
200
|
+
- Uses source's custom headers and conditional request headers (ETag, If-Modified-Since)
|
|
201
|
+
- HTTP 200-399 is considered successful
|
|
202
|
+
- Creates a HealthCheckLog record regardless of outcome
|
|
203
|
+
- Does NOT update health status (that's the monitor's job)
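
Two of these features lend themselves to a small sketch: classifying the response code and assembling conditional request headers. Both helper names are hypothetical. The header names are the standard HTTP ones, where a stored ETag is sent back as `If-None-Match`:

```ruby
# Hypothetical helper: 200-399 counts as success, so a 304 Not Modified passes.
def successful_status?(code)
  (200..399).cover?(code.to_i)
end

# Hypothetical helper: merges conditional headers into a copy of the custom headers.
def conditional_headers(custom_headers, etag: nil, last_modified: nil)
  headers = custom_headers.dup
  headers["If-None-Match"] = etag if etag
  headers["If-Modified-Since"] = last_modified if last_modified
  headers
end

custom = { "User-Agent" => "ExampleBot" }
headers = conditional_headers(custom, etag: "\"abc123\"")
```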
|
|
204
|
+
|
|
205
|
+
## SourceHealthReset
|
|
206
|
+
|
|
207
|
+
Manually resets all health state to defaults:
|
|
208
|
+
|
|
209
|
+
```ruby
|
|
210
|
+
SourceMonitor::Health::SourceHealthReset.call(source: source)
|
|
211
|
+
```
|
|
212
|
+
|
|
213
|
+
Resets:
|
|
214
|
+
- `health_status` -> "healthy"
|
|
215
|
+
- `auto_paused_at`, `auto_paused_until` -> nil
|
|
216
|
+
- `rolling_success_rate` -> nil
|
|
217
|
+
- `failure_count` -> 0
|
|
218
|
+
- `last_error`, `last_error_at` -> nil
|
|
219
|
+
- `backoff_until` -> nil
|
|
220
|
+
- `fetch_status` -> "idle"
|
|
221
|
+
- `fetch_retry_attempt` -> 0
|
|
222
|
+
- `fetch_circuit_opened_at`, `fetch_circuit_until` -> nil
|
|
223
|
+
- `next_fetch_at` -> calculated from fetch_interval_minutes
|
|
224
|
+
|
|
225
|
+
Uses `with_lock` for concurrency safety.
|
|
226
|
+
|
|
227
|
+
## ImportSourceHealthCheck
|
|
228
|
+
|
|
229
|
+
Lightweight health check for import candidates (no Source record needed):
|
|
230
|
+
|
|
231
|
+
```ruby
|
|
232
|
+
result = Health::ImportSourceHealthCheck.new(feed_url: url).call
|
|
233
|
+
result.status # => "healthy" or "unhealthy"
|
|
234
|
+
result.error_message # => nil or error description
|
|
235
|
+
result.http_status # => HTTP status code
|
|
236
|
+
```
|
|
237
|
+
|
|
238
|
+
## Adding a New Health Rule
|
|
239
|
+
|
|
240
|
+
To add a new condition that affects health status:
|
|
241
|
+
|
|
242
|
+
### Step 1: Define the Rule
|
|
243
|
+
|
|
244
|
+
Add a method to `SourceHealthMonitor`:
|
|
245
|
+
|
|
246
|
+
```ruby
|
|
247
|
+
def my_custom_condition?(logs)
|
|
248
|
+
# Evaluate logs or source state
|
|
249
|
+
# Return true if condition is met
|
|
250
|
+
end
|
|
251
|
+
```
|
|
252
|
+
|
|
253
|
+
### Step 2: Integrate into Status Determination
|
|
254
|
+
|
|
255
|
+
Add the condition to `determine_status`:
|
|
256
|
+
|
|
257
|
+
```ruby
|
|
258
|
+
def determine_status(rate, auto_paused_until, logs)
|
|
259
|
+
if auto_paused_active?(auto_paused_until)
|
|
260
|
+
"auto_paused"
|
|
261
|
+
elsif my_custom_condition?(logs) # Add here
|
|
262
|
+
"my_custom_status"
|
|
263
|
+
elsif consecutive_failures(logs) >= 3
|
|
264
|
+
"declining"
|
|
265
|
+
# ...
|
|
266
|
+
end
end
|
|
267
|
+
```
|
|
268
|
+
|
|
269
|
+
### Step 3: Add Configuration
|
|
270
|
+
|
|
271
|
+
If the rule needs a threshold, add it to `HealthSettings`:
|
|
272
|
+
|
|
273
|
+
```ruby
|
|
274
|
+
class HealthSettings
|
|
275
|
+
attr_accessor :my_threshold
|
|
276
|
+
def reset!
|
|
277
|
+
@my_threshold = 0.5 # default
|
|
278
|
+
# ...
|
|
279
|
+
end
|
|
280
|
+
end
|
|
281
|
+
```
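
A runnable sketch of the same pattern, using a hypothetical `SketchHealthSettings` class. Calling `reset!` from `initialize` is an assumption of this sketch, made so that a fresh instance starts with its defaults:

```ruby
# Hypothetical settings class following the attr_accessor + reset! pattern.
class SketchHealthSettings
  attr_accessor :window_size, :my_threshold

  def initialize
    reset!
  end

  # Restores every setting to its default value.
  def reset!
    @window_size = 20
    @my_threshold = 0.5
  end
end

settings = SketchHealthSettings.new
settings.my_threshold = 0.7
settings.reset!
```

Keeping defaults inside `reset!` means tests can restore a known state with a single call.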
|
|
282
|
+
|
|
283
|
+
### Step 4: Update Source Model
|
|
284
|
+
|
|
285
|
+
If adding a new status value, ensure views and helpers handle it.
|
|
286
|
+
|
|
287
|
+
### Step 5: Write Tests
|
|
288
|
+
|
|
289
|
+
```ruby
|
|
290
|
+
test "source enters my_custom_status when condition met" do
|
|
291
|
+
source = create_source!
|
|
292
|
+
# Create fetch logs that trigger the condition
|
|
293
|
+
monitor = SourceMonitor::Health::SourceHealthMonitor.new(source: source)
|
|
294
|
+
monitor.call
|
|
295
|
+
assert_equal "my_custom_status", source.reload.health_status
|
|
296
|
+
end
|
|
297
|
+
```
|
|
298
|
+
|
|
299
|
+
## Testing
|
|
300
|
+
|
|
301
|
+
- Health monitor tests: `test/lib/source_monitor/health/source_health_monitor_test.rb`
|
|
302
|
+
- Health check tests: `test/lib/source_monitor/health/source_health_check_test.rb`
|
|
303
|
+
- Health reset tests: `test/lib/source_monitor/health/source_health_reset_test.rb`
|
|
304
|
+
- Health module tests: `test/lib/source_monitor/health/health_module_test.rb`
|
|
305
|
+
- Health check job tests: `test/jobs/source_monitor/source_health_check_job_test.rb`
|
|
306
|
+
- Controller tests: `test/controllers/source_monitor/source_health_checks_controller_test.rb`
|
|
307
|
+
|
|
308
|
+
Use `PARALLEL_WORKERS=1` when running a single test file to avoid a PG segfault.
|
|
309
|
+
|
|
310
|
+
## Checklist
|
|
311
|
+
|
|
312
|
+
- [ ] New health rules evaluate from `fetch_logs` (rolling window)
|
|
313
|
+
- [ ] Thresholds are configurable via `HealthSettings`
|
|
314
|
+
- [ ] Per-source overrides supported where appropriate
|
|
315
|
+
- [ ] Status transitions only fire when status actually changes
|
|
316
|
+
- [ ] Auto-pause cooldown prevents flapping
|
|
317
|
+
- [ ] Tests cover threshold boundaries and edge cases
|
|
318
|
+
- [ ] Health status values handled in views/helpers
|
|
319
|
+
|
|
320
|
+
## References
|
|
321
|
+
|
|
322
|
+
- `lib/source_monitor/health/` -- All health system code
|
|
323
|
+
- `lib/source_monitor/configuration/health_settings.rb` -- Configuration
|
|
324
|
+
- `app/models/source_monitor/source.rb` -- Source health fields
|
|
325
|
+
- `app/models/source_monitor/health_check_log.rb` -- Health check log model
|
|
326
|
+
- `app/jobs/source_monitor/source_health_check_job.rb` -- Background health check
|
|
327
|
+
- `test/lib/source_monitor/health/` -- Health system tests
|