source_monitor 0.6.0 → 0.7.0
- checksums.yaml +4 -4
- data/.gitignore +7 -0
- data/.vbw-planning/ROADMAP.md +32 -0
- data/.vbw-planning/STATE.md +27 -0
- data/.vbw-planning/phases/01-aia-certificate-resolution/.context-dev.md +17 -0
- data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-01-SUMMARY.md +26 -0
- data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-01.md +71 -0
- data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-02-SUMMARY.md +16 -0
- data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-02.md +56 -0
- data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-03-SUMMARY.md +17 -0
- data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-03.md +98 -0
- data/CHANGELOG.md +14 -0
- data/Gemfile.lock +1 -1
- data/VERSION +1 -1
- data/lib/source_monitor/fetching/feed_fetcher/entry_processor.rb +5 -0
- data/lib/source_monitor/fetching/feed_fetcher/source_updater.rb +7 -4
- data/lib/source_monitor/fetching/feed_fetcher.rb +49 -3
- data/lib/source_monitor/items/item_creator.rb +31 -5
- data/lib/source_monitor/version.rb +1 -1
- metadata +10 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: b65480547bf48a4cabf2d1c98dbd6c965a6b7342c3da362b3987b1bed3e59a5d
|
|
4
|
+
data.tar.gz: 775abb18c5c94b5cf11e78e01c296a618c7ef884cb328f4f5f886c2d144c2f75
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: f75e313708962d167d7b362ed4f8af42be433e28d5a9e1aa59f290d82ed12103800abaf2f904098211c351f7c3b9265af05363d2364f477e22af3e7de2bc9755
|
|
7
|
+
data.tar.gz: ab0e7911a85c744f632d2fad3dbeb671ddb55b3e1f4cc0ed3a0dbc859e2dad21ccacdec8aff7ee45445cf84f3cfb3ddaf2ba803a7e533cacdc14b8cb3ee61ab6
|
data/.gitignore
CHANGED
|
@@ -22,3 +22,10 @@
|
|
|
22
22
|
.vbw-planning/.claude-md-migrated
|
|
23
23
|
.vbw-planning/.watchdog-pid
|
|
24
24
|
.vbw-planning/.watchdog.log
|
|
25
|
+
.vbw-planning/.agent-pids
|
|
26
|
+
.vbw-planning/.agent-panes
|
|
27
|
+
.vbw-planning/.active-agent
|
|
28
|
+
.vbw-planning/.active-agent-count
|
|
29
|
+
.vbw-planning/.todo-flat-migrated
|
|
30
|
+
/codebase_analysis.md
|
|
31
|
+
*.gem
|
data/.vbw-planning/ROADMAP.md
ADDED
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Roadmap
|
|
2
|
+
|
|
3
|
+
## Milestone: aia-ssl-fix
|
|
4
|
+
|
|
5
|
+
### Phases
|
|
6
|
+
|
|
7
|
+
1. [x] **AIA Certificate Resolution** -- Fix SSL failures for feeds with missing intermediate certificates by implementing AIA (Authority Information Access) resolution
|
|
8
|
+
|
|
9
|
+
### Phase Details
|
|
10
|
+
|
|
11
|
+
#### Phase 1: AIA Certificate Resolution
|
|
12
|
+
|
|
13
|
+
**Goal:** Implement automatic AIA intermediate certificate fetching so feeds like netflixtechblog.com (served via Medium/AWS with wrong intermediates) succeed without manual cert configuration.
|
|
14
|
+
|
|
15
|
+
**Requirements:**
|
|
16
|
+
- REQ-AIA-01: Create AIAResolver module with thread-safe cache and 1-hour TTL
|
|
17
|
+
- REQ-AIA-02: Add cert_store: parameter to HTTP.client for custom cert stores
|
|
18
|
+
- REQ-AIA-03: On Faraday::SSLError, attempt AIA resolution before failing
|
|
19
|
+
- REQ-AIA-04: Best-effort only -- never make things worse (rescue StandardError -> nil)
|
|
20
|
+
|
|
21
|
+
**Success Criteria:**
|
|
22
|
+
- [ ] AIAResolver.resolve(hostname) fetches leaf cert, extracts AIA URL, downloads intermediate
|
|
23
|
+
- [ ] HTTP.client(cert_store:) accepts and uses custom cert stores
|
|
24
|
+
- [ ] FeedFetcher retries once with AIA-resolved cert store on SSL failure
|
|
25
|
+
- [ ] All existing tests pass (1003+), new tests cover AIA paths
|
|
26
|
+
- [ ] RuboCop zero offenses, Brakeman zero warnings
|
|
27
|
+
|
|
28
|
+
### Progress
|
|
29
|
+
|
|
30
|
+
| Phase | Status | Plans | Completed |
|
|
31
|
+
|-------|--------|-------|-----------|
|
|
32
|
+
| 1. AIA Certificate Resolution | Planned | 3 | 0 |
|
data/.vbw-planning/STATE.md
ADDED
|
@@ -0,0 +1,27 @@
|
|
|
1
|
+
# State
|
|
2
|
+
|
|
3
|
+
## Current Position
|
|
4
|
+
|
|
5
|
+
- **Milestone:** aia-ssl-fix
|
|
6
|
+
- **Phase:** 1 -- AIA Certificate Resolution
|
|
7
|
+
- **Status:** Complete
|
|
8
|
+
- **Progress:** 100%
|
|
9
|
+
|
|
10
|
+
## Decisions
|
|
11
|
+
|
|
12
|
+
| Decision | Date | Context |
|
|
13
|
+
|----------|------|---------|
|
|
14
|
+
| Single-phase milestone for AIA fix | 2026-02-17 | Complete plan already validated; no scoping needed |
|
|
15
|
+
| 3 plans with wave parallelism | 2026-02-17 | Plans 01+02 (wave 1, disjoint files), Plan 03 (wave 2, integration) |
|
|
16
|
+
|
|
17
|
+
## Todos
|
|
18
|
+
|
|
19
|
+
## Metrics
|
|
20
|
+
|
|
21
|
+
- **Started:** 2026-02-17
|
|
22
|
+
- **Phases:** 1
|
|
23
|
+
- **Plans:** 3
|
|
24
|
+
- **Tests at start:** 1003
|
|
25
|
+
- **Tests at end:** 1025
|
|
26
|
+
- **Commits:** 4 (f60e9bf, 4c9568a, 9c38bc3, e68a6b0)
|
|
27
|
+
- **Plans completed:** 3/3
|
data/.vbw-planning/phases/01-aia-certificate-resolution/.context-dev.md
ADDED
|
@@ -0,0 +1,17 @@
|
|
|
1
|
+
## Phase 1 Context
|
|
2
|
+
|
|
3
|
+
### Goal
|
|
4
|
+
Not available
|
|
5
|
+
|
|
6
|
+
### Codebase Map Available
|
|
7
|
+
Codebase mapping exists in `.vbw-planning/codebase/`. Key files:
|
|
8
|
+
- `ARCHITECTURE.md`
|
|
9
|
+
- `CONCERNS.md`
|
|
10
|
+
- `PATTERNS.md`
|
|
11
|
+
- `DEPENDENCIES.md`
|
|
12
|
+
- `STRUCTURE.md`
|
|
13
|
+
- `CONVENTIONS.md`
|
|
14
|
+
- `TESTING.md`
|
|
15
|
+
- `STACK.md`
|
|
16
|
+
|
|
17
|
+
Read CONVENTIONS.md, PATTERNS.md, STRUCTURE.md, and DEPENDENCIES.md first to bootstrap codebase understanding.
|
data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-01-SUMMARY.md
ADDED
|
@@ -0,0 +1,26 @@
|
|
|
1
|
+
---
|
|
2
|
+
phase: 1
|
|
3
|
+
plan: 1
|
|
4
|
+
status: complete
|
|
5
|
+
---
|
|
6
|
+
# Plan 01 Summary: AIA Resolver Module
|
|
7
|
+
|
|
8
|
+
## Tasks Completed
|
|
9
|
+
- [x] Task 1: Created lib/source_monitor/http/aia_resolver.rb
|
|
10
|
+
- [x] Task 2: Created test/lib/source_monitor/http/aia_resolver_test.rb
|
|
11
|
+
|
|
12
|
+
## Commits
|
|
13
|
+
- 4c9568a: feat(1-1): add AIA intermediate certificate resolver
|
|
14
|
+
|
|
15
|
+
## Files Modified
|
|
16
|
+
- lib/source_monitor/http/aia_resolver.rb (created)
|
|
17
|
+
- test/lib/source_monitor/http/aia_resolver_test.rb (created)
|
|
18
|
+
|
|
19
|
+
## What Was Built
|
|
20
|
+
- `SourceMonitor::HTTP::AIAResolver` module with thread-safe cached resolution of missing intermediate SSL certificates via AIA (Authority Information Access) X.509 extension
|
|
21
|
+
- Public API: `resolve(hostname)`, `enhanced_cert_store(certs)`, `clear_cache!`, `cache_size`
|
|
22
|
+
- Private methods: `fetch_leaf_certificate` (VERIFY_NONE + SNI), `extract_aia_url` (uses `cert.ca_issuer_uris`), `download_certificate` (DER-first, PEM fallback)
|
|
23
|
+
- 11 unit tests covering all public/private methods, caching, TTL expiration, and error handling
|
|
24
|
+
|
|
25
|
+
## Deviations
|
|
26
|
+
- None
|
data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-01.md
ADDED
|
@@ -0,0 +1,71 @@
|
|
|
1
|
+
---
|
|
2
|
+
phase: 1
|
|
3
|
+
plan: 1
|
|
4
|
+
title: "AIA Resolver Module"
|
|
5
|
+
wave: 1
|
|
6
|
+
depends_on: []
|
|
7
|
+
must_haves:
|
|
8
|
+
- AIAResolver module with resolve, enhanced_cert_store, clear_cache!, cache_size
|
|
9
|
+
- Thread-safe Mutex + Hash cache with 1-hour TTL per hostname
|
|
10
|
+
- fetch_leaf_certificate with VERIFY_NONE and SNI support
|
|
11
|
+
- extract_aia_url using cert.ca_issuer_uris (not regex)
|
|
12
|
+
- download_certificate with DER-first, PEM-fallback parsing
|
|
13
|
+
- All methods rescue StandardError and return nil
|
|
14
|
+
- Unit tests covering all public and private methods
|
|
15
|
+
---
|
|
16
|
+
|
|
17
|
+
# Plan 01: AIA Resolver Module
|
|
18
|
+
|
|
19
|
+
## Goal
|
|
20
|
+
|
|
21
|
+
Create `SourceMonitor::HTTP::AIAResolver` -- a standalone module that resolves missing intermediate certificates via the AIA (Authority Information Access) extension in X.509 certificates.
|
|
22
|
+
|
|
23
|
+
## Tasks
|
|
24
|
+
|
|
25
|
+
### Task 1: Create lib/source_monitor/http/aia_resolver.rb
|
|
26
|
+
|
|
27
|
+
Create new module `SourceMonitor::HTTP::AIAResolver` with class methods:
|
|
28
|
+
|
|
29
|
+
**Public API:**
|
|
30
|
+
- `resolve(hostname, port: 443)` -- Entry point. Checks cache first, then: fetch leaf cert -> extract AIA URL -> download intermediate. Returns `OpenSSL::X509::Certificate` or `nil`.
|
|
31
|
+
- `enhanced_cert_store(additional_certs)` -- Builds `OpenSSL::X509::Store` with `set_default_paths` plus extra certs from the array.
|
|
32
|
+
- `clear_cache!` -- Clears the hostname cache (for testing).
|
|
33
|
+
- `cache_size` -- Returns number of cached entries (for testing).
|
|
34
|
+
|
|
35
|
+
**Private methods:**
|
|
36
|
+
- `fetch_leaf_certificate(hostname, port)` -- TCP+SSL connect with `VERIFY_NONE` to get the server's leaf cert. 5s connect timeout. Uses `ssl_socket.hostname=` for SNI.
|
|
37
|
+
- `extract_aia_url(cert)` -- Uses Ruby's built-in `cert.ca_issuer_uris` method. Returns first URI string or nil.
|
|
38
|
+
- `download_certificate(url)` -- Plain HTTP GET (AIA URLs are always HTTP, not HTTPS). 5s timeout. Parses DER body as `OpenSSL::X509::Certificate`, falls back to PEM on failure.
|
|
39
|
+
|
|
40
|
+
**Cache:** `Mutex` + `Hash` keyed by hostname. Each entry stores `{ cert:, expires_at: }` with 1-hour TTL.
|
|
41
|
+
|
|
42
|
+
**Safety:** All methods rescue `StandardError` and return `nil`. This is best-effort -- never makes things worse.
|
|
43
|
+
|
|
44
|
+
### Task 2: Create test/lib/source_monitor/http/aia_resolver_test.rb
|
|
45
|
+
|
|
46
|
+
Unit tests:
|
|
47
|
+
- `extract_aia_url` with cert that has AIA extension returns URL
|
|
48
|
+
- `extract_aia_url` with cert without AIA returns nil
|
|
49
|
+
- `download_certificate` with DER body parses correctly (WebMock stub)
|
|
50
|
+
- `download_certificate` returns nil on HTTP 404 (WebMock)
|
|
51
|
+
- `download_certificate` returns nil on timeout (WebMock)
|
|
52
|
+
- `enhanced_cert_store` returns store with added certs
|
|
53
|
+
- `enhanced_cert_store` handles empty array gracefully
|
|
54
|
+
- Cache: resolve stores result, second call returns cached
|
|
55
|
+
- Cache: expired entries are re-fetched
|
|
56
|
+
- `clear_cache!` empties the cache
|
|
57
|
+
- `resolve` returns nil when hostname unreachable (stub fetch_leaf_certificate)
|
|
58
|
+
|
|
59
|
+
## Files
|
|
60
|
+
|
|
61
|
+
| Action | Path |
|
|
62
|
+
|--------|------|
|
|
63
|
+
| CREATE | `lib/source_monitor/http/aia_resolver.rb` |
|
|
64
|
+
| CREATE | `test/lib/source_monitor/http/aia_resolver_test.rb` |
|
|
65
|
+
|
|
66
|
+
## Verification
|
|
67
|
+
|
|
68
|
+
```bash
|
|
69
|
+
PARALLEL_WORKERS=1 bin/rails test test/lib/source_monitor/http/aia_resolver_test.rb
|
|
70
|
+
bin/rubocop lib/source_monitor/http/aia_resolver.rb test/lib/source_monitor/http/aia_resolver_test.rb
|
|
71
|
+
```
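Reviewer note: `lib/source_monitor/http/aia_resolver.rb` itself does not appear in this diff, so the following is only a minimal sketch of the cache behaviour Plan 01 describes (Mutex + Hash, 1-hour TTL, rescue to nil). The module name `AIAResolverSketch` and the `fetcher:` and `now:` parameters are assumptions made so the example runs without network access; they are not the gem's API. This sketch also caches `nil` results, which the plan does not specify.

```ruby
require "openssl"

# Hypothetical illustration of Plan 01's cache; not the shipped module.
module AIAResolverSketch
  CACHE_TTL = 3600 # seconds, the plan's 1-hour TTL

  @mutex = Mutex.new
  @cache = {}

  class << self
    # `fetcher` stands in for the plan's chain:
    # fetch leaf cert -> extract AIA URL -> download intermediate.
    def resolve(hostname, fetcher:, now: Time.now)
      @mutex.synchronize do
        entry = @cache[hostname]
        return entry[:cert] if entry && entry[:expires_at] > now
      end

      cert = fetcher.call(hostname)
      @mutex.synchronize do
        @cache[hostname] = { cert: cert, expires_at: now + CACHE_TTL }
      end
      cert
    rescue StandardError
      nil # best-effort: the recovery path never raises
    end

    # Default trust roots plus any extra certificates passed in.
    def enhanced_cert_store(additional_certs)
      store = OpenSSL::X509::Store.new
      store.set_default_paths
      Array(additional_certs).each { |cert| store.add_cert(cert) }
      store
    end

    def clear_cache!
      @mutex.synchronize { @cache.clear }
    end

    def cache_size
      @mutex.synchronize { @cache.size }
    end
  end
end
```

Injecting the clock through `now:` is one way to exercise TTL expiry in a test without sleeping; the plan's test list mentions expiry coverage but not how it is implemented.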
|
data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-02-SUMMARY.md
ADDED
|
@@ -0,0 +1,16 @@
|
|
|
1
|
+
---
|
|
2
|
+
phase: 1
|
|
3
|
+
plan: 2
|
|
4
|
+
status: complete
|
|
5
|
+
commit: f60e9bf
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## What Was Built
|
|
9
|
+
- Added `cert_store:` keyword parameter to `HTTP.client` for custom OpenSSL cert stores
|
|
10
|
+
- Added `autoload :AIAResolver` to HTTP module
|
|
11
|
+
- Plumbed cert_store through `configure_request` -> `configure_ssl` with fallback to `default_cert_store`
|
|
12
|
+
- 2 new tests: custom cert_store usage, ssl_ca_file takes precedence over cert_store
|
|
13
|
+
|
|
14
|
+
## Files Modified
|
|
15
|
+
- `lib/source_monitor/http.rb` — autoload, cert_store param, SSL plumbing
|
|
16
|
+
- `test/lib/source_monitor/http_test.rb` — 2 new cert_store tests
|
data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-02.md
ADDED
|
@@ -0,0 +1,56 @@
|
|
|
1
|
+
---
|
|
2
|
+
phase: 1
|
|
3
|
+
plan: 2
|
|
4
|
+
title: "HTTP Module cert_store Parameter"
|
|
5
|
+
wave: 1
|
|
6
|
+
depends_on: []
|
|
7
|
+
must_haves:
|
|
8
|
+
- Add autoload :AIAResolver to module HTTP
|
|
9
|
+
- Add cert_store keyword to client method
|
|
10
|
+
- Pass cert_store through configure_request to configure_ssl
|
|
11
|
+
- configure_ssl uses cert_store when no ssl_ca_file/ssl_ca_path
|
|
12
|
+
- Tests for cert_store parameter usage
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
# Plan 02: HTTP Module cert_store Parameter
|
|
16
|
+
|
|
17
|
+
## Goal
|
|
18
|
+
|
|
19
|
+
Extend `SourceMonitor::HTTP.client` to accept an optional `cert_store:` parameter, enabling callers (like FeedFetcher's AIA retry) to provide a custom `OpenSSL::X509::Store` with additional certificates.
|
|
20
|
+
|
|
21
|
+
## Tasks
|
|
22
|
+
|
|
23
|
+
### Task 1: Modify lib/source_monitor/http.rb
|
|
24
|
+
|
|
25
|
+
1. Add autoload inside `module HTTP` (after RETRY_STATUSES):
|
|
26
|
+
```ruby
|
|
27
|
+
autoload :AIAResolver, "source_monitor/http/aia_resolver"
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
2. Add `cert_store: nil` keyword to `client` method signature.
|
|
31
|
+
|
|
32
|
+
3. Pass `cert_store:` through `configure_request` to `configure_ssl`:
|
|
33
|
+
- Add `cert_store:` parameter to `configure_request`
|
|
34
|
+
- Pass it to `configure_ssl(connection, settings, cert_store:)`
|
|
35
|
+
|
|
36
|
+
4. In `configure_ssl`: when no `ssl_ca_file` or `ssl_ca_path` is set, use `cert_store || default_cert_store`.
|
|
37
|
+
|
|
38
|
+
### Task 2: Add tests to test/lib/source_monitor/http_test.rb
|
|
39
|
+
|
|
40
|
+
Add 2 tests:
|
|
41
|
+
- `cert_store: param is used when no ssl_ca_file or ssl_ca_path` -- pass a custom store, verify `connection.ssl.cert_store` is the custom store
|
|
42
|
+
- `cert_store: is ignored when ssl_ca_file is set` -- configure ssl_ca_file, pass cert_store, verify ca_file takes precedence
|
|
43
|
+
|
|
44
|
+
## Files
|
|
45
|
+
|
|
46
|
+
| Action | Path |
|
|
47
|
+
|--------|------|
|
|
48
|
+
| MODIFY | `lib/source_monitor/http.rb` |
|
|
49
|
+
| MODIFY | `test/lib/source_monitor/http_test.rb` |
|
|
50
|
+
|
|
51
|
+
## Verification
|
|
52
|
+
|
|
53
|
+
```bash
|
|
54
|
+
PARALLEL_WORKERS=1 bin/rails test test/lib/source_monitor/http_test.rb
|
|
55
|
+
bin/rubocop lib/source_monitor/http.rb test/lib/source_monitor/http_test.rb
|
|
56
|
+
```
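Reviewer note: `lib/source_monitor/http.rb` is not part of this diff. The precedence rule in Plan 02 (an explicit `ssl_ca_file` or `ssl_ca_path` wins, otherwise `cert_store || default_cert_store`) could look roughly like the sketch below. The two Structs are stand-ins for the Faraday SSL options object and the gem's settings object, and every name ending in `Sketch` or `_sketch` is hypothetical. The `if/elsif` ordering between `ssl_ca_file` and `ssl_ca_path` is also an assumption.

```ruby
require "openssl"

# Stand-ins so the example runs without Faraday or the gem loaded.
SslOptionsSketch = Struct.new(:ca_file, :ca_path, :cert_store, keyword_init: true)
SettingsSketch   = Struct.new(:ssl_ca_file, :ssl_ca_path, keyword_init: true)

# Store backed by the system's default trust roots.
def default_cert_store_sketch
  OpenSSL::X509::Store.new.tap(&:set_default_paths)
end

# An explicitly configured CA file or path takes precedence; the custom
# cert_store is only consulted when neither is set.
def configure_ssl_sketch(ssl, settings, cert_store: nil)
  if settings.ssl_ca_file
    ssl.ca_file = settings.ssl_ca_file
  elsif settings.ssl_ca_path
    ssl.ca_path = settings.ssl_ca_path
  else
    ssl.cert_store = cert_store || default_cert_store_sketch
  end
  ssl
end
```

This mirrors the two tests the plan lists: a custom store is used when no CA file is configured, and it is ignored when one is.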
|
data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-03-SUMMARY.md
ADDED
|
@@ -0,0 +1,17 @@
|
|
|
1
|
+
---
|
|
2
|
+
phase: 1
|
|
3
|
+
plan: 3
|
|
4
|
+
status: complete
|
|
5
|
+
commit: 9c38bc3
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## What Was Built
|
|
9
|
+
- Wired AIA certificate resolution into FeedFetcher's SSL error handling
|
|
10
|
+
- On `Faraday::SSLError`, attempts intermediate cert recovery via `AIAResolver.resolve` before raising
|
|
11
|
+
- Guard flag `@aia_attempted` prevents infinite recursion; `rescue StandardError => nil` ensures recovery never makes things worse
|
|
12
|
+
- Tags `instrumentation_payload[:aia_resolved] = true` on successful AIA recovery
|
|
13
|
+
- 3 integration tests: success retry path, nil fallback to ConnectionError, non-SSL skip
|
|
14
|
+
|
|
15
|
+
## Files Modified
|
|
16
|
+
- `lib/source_monitor/fetching/feed_fetcher.rb` — split SSL rescue, add `attempt_aia_recovery`
|
|
17
|
+
- `test/lib/source_monitor/fetching/feed_fetcher_test.rb` — 3 AIA resolution tests
|
data/.vbw-planning/phases/01-aia-certificate-resolution/PLAN-03.md
ADDED
|
@@ -0,0 +1,98 @@
|
|
|
1
|
+
---
|
|
2
|
+
phase: 1
|
|
3
|
+
plan: 3
|
|
4
|
+
title: "FeedFetcher AIA Retry Integration"
|
|
5
|
+
wave: 2
|
|
6
|
+
depends_on: [1, 2]
|
|
7
|
+
must_haves:
|
|
8
|
+
- Separate Faraday::SSLError rescue from Faraday::ConnectionFailed
|
|
9
|
+
- On SSLError attempt AIA resolution once (aia_attempted flag)
|
|
10
|
+
- Parse hostname from source.feed_url for AIA resolve
|
|
11
|
+
- If intermediate found rebuild connection with enhanced cert store and retry
|
|
12
|
+
- If nil raise ConnectionError as before
|
|
13
|
+
- Tag successful recoveries with aia_resolved in instrumentation
|
|
14
|
+
- Integration tests for all AIA retry paths
|
|
15
|
+
- Full test suite passes (1003+ tests)
|
|
16
|
+
- RuboCop zero offenses
|
|
17
|
+
- Brakeman zero warnings
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
# Plan 03: FeedFetcher AIA Retry Integration
|
|
21
|
+
|
|
22
|
+
## Goal
|
|
23
|
+
|
|
24
|
+
Wire AIA resolution into FeedFetcher's error handling so SSL failures automatically attempt intermediate certificate recovery before giving up.
|
|
25
|
+
|
|
26
|
+
## Tasks
|
|
27
|
+
|
|
28
|
+
### Task 1: Modify lib/source_monitor/fetching/feed_fetcher.rb
|
|
29
|
+
|
|
30
|
+
Modify `perform_fetch` (lines 77-90):
|
|
31
|
+
|
|
32
|
+
1. **Split rescue clause:** Separate `Faraday::SSLError` from `Faraday::ConnectionFailed` into its own rescue:
|
|
33
|
+
```ruby
|
|
34
|
+
rescue Faraday::ConnectionFailed => error
|
|
35
|
+
raise ConnectionError.new(error.message, original_error: error)
|
|
36
|
+
rescue Faraday::SSLError => error
|
|
37
|
+
attempt_aia_recovery(error) || raise(ConnectionError.new(error.message, original_error: error))
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
2. **Add `attempt_aia_recovery` private method:**
|
|
41
|
+
- Guard: return nil if `@aia_attempted` is true (prevents recursion)
|
|
42
|
+
- Set `@aia_attempted = true`
|
|
43
|
+
- Parse hostname from `URI.parse(source.feed_url).host`
|
|
44
|
+
- Call `SourceMonitor::HTTP::AIAResolver.resolve(hostname)`
|
|
45
|
+
- If intermediate found:
|
|
46
|
+
- Build enhanced cert store via `AIAResolver.enhanced_cert_store([intermediate])`
|
|
47
|
+
- Rebuild `@connection = SourceMonitor::HTTP.client(cert_store: store, headers: request_headers)`
|
|
48
|
+
- Return `perform_request` (the retry)
|
|
49
|
+
- If nil: return nil (caller raises ConnectionError)
|
|
50
|
+
- Rescue StandardError -> nil (never make retry worse)
|
|
51
|
+
|
|
52
|
+
3. **Tag instrumentation:** In the `handle_response` path after successful AIA retry, the `instrumentation_payload[:aia_resolved] = true` will naturally flow through since `perform_fetch` calls `handle_response` on the retried response.
|
|
53
|
+
|
|
54
|
+
### Task 2: Add tests to test/lib/source_monitor/fetching/feed_fetcher_test.rb
|
|
55
|
+
|
|
56
|
+
Add 3 tests under a new section `# -- AIA Certificate Resolution --`:
|
|
57
|
+
|
|
58
|
+
1. **SSL error + AIA resolve succeeds -> fetch succeeds:**
|
|
59
|
+
- First stub: raise `Faraday::SSLError`
|
|
60
|
+
- Stub `AIAResolver.resolve` to return a mock certificate
|
|
61
|
+
- Stub `AIAResolver.enhanced_cert_store` to return a store
|
|
62
|
+
- Second stub (after retry): return 200 with RSS body
|
|
63
|
+
- Assert result.status == :fetched
|
|
64
|
+
|
|
65
|
+
2. **SSL error + AIA resolve returns nil -> ConnectionError:**
|
|
66
|
+
- Stub to raise `Faraday::SSLError`
|
|
67
|
+
- Stub `AIAResolver.resolve` to return nil
|
|
68
|
+
- Assert result.status == :failed
|
|
69
|
+
- Assert result.error is ConnectionError
|
|
70
|
+
|
|
71
|
+
3. **Non-SSL ConnectionError -> AIA not attempted:**
|
|
72
|
+
- Stub to raise `Faraday::ConnectionFailed`
|
|
73
|
+
- Verify `AIAResolver.resolve` was NOT called
|
|
74
|
+
- Assert result.status == :failed
|
|
75
|
+
- Assert result.error is ConnectionError
|
|
76
|
+
|
|
77
|
+
### Task 3: Run full verification
|
|
78
|
+
|
|
79
|
+
1. `PARALLEL_WORKERS=1 bin/rails test test/lib/source_monitor/fetching/feed_fetcher_test.rb`
|
|
80
|
+
2. `bin/rails test` (full suite)
|
|
81
|
+
3. `bin/rubocop`
|
|
82
|
+
4. `bin/brakeman --no-pager`
|
|
83
|
+
|
|
84
|
+
## Files
|
|
85
|
+
|
|
86
|
+
| Action | Path |
|
|
87
|
+
|--------|------|
|
|
88
|
+
| MODIFY | `lib/source_monitor/fetching/feed_fetcher.rb` |
|
|
89
|
+
| MODIFY | `test/lib/source_monitor/fetching/feed_fetcher_test.rb` |
|
|
90
|
+
|
|
91
|
+
## Verification
|
|
92
|
+
|
|
93
|
+
```bash
|
|
94
|
+
PARALLEL_WORKERS=1 bin/rails test test/lib/source_monitor/fetching/feed_fetcher_test.rb
|
|
95
|
+
bin/rails test
|
|
96
|
+
bin/rubocop
|
|
97
|
+
bin/brakeman --no-pager
|
|
98
|
+
```
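Reviewer note: the AIA changes to `feed_fetcher.rb` described in Plan 03 are not visible in this diff, so the retry-once flow is sketched here under assumptions. `AiaRetrySketch`, its nested error classes, and the injected `resolve:` and `request:` lambdas are all hypothetical stand-ins (for `FeedFetcher`, `Faraday::SSLError`, the gem's `ConnectionError`, `AIAResolver.resolve`, and the HTTP request respectively).

```ruby
# Hypothetical stand-in for the retry-once logic in Plan 03.
class AiaRetrySketch
  class SSLError < StandardError; end        # plays the role of Faraday::SSLError
  class ConnectionError < StandardError; end # plays the role of the gem's ConnectionError

  def initialize(resolve:, request:)
    @resolve = resolve   # hostname -> intermediate cert or nil
    @request = request   # extra certs (or nil) -> response, may raise SSLError
    @aia_attempted = false
  end

  def perform_fetch(hostname)
    @request.call(nil)
  rescue SSLError => error
    # Try recovery once; if it yields nothing, fail as before.
    attempt_aia_recovery(hostname) || raise(ConnectionError, error.message)
  end

  private

  def attempt_aia_recovery(hostname)
    return nil if @aia_attempted # guard flag: at most one recovery attempt

    @aia_attempted = true
    intermediate = @resolve.call(hostname)
    return nil unless intermediate

    @request.call([intermediate]) # the retry, with the extra certificate
  rescue StandardError
    nil # recovery is best-effort and never makes the failure worse
  end
end
```

Because the inner rescue swallows every `StandardError`, a second SSL failure during the retry surfaces as the original `ConnectionError` rather than as a new exception type.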
|
data/CHANGELOG.md
CHANGED
|
@@ -15,6 +15,20 @@ All notable changes to this project are documented below. The format follows [Ke
|
|
|
15
15
|
|
|
16
16
|
- No unreleased changes yet.
|
|
17
17
|
|
|
18
|
+
## [0.7.0] - 2026-02-18
|
|
19
|
+
|
|
20
|
+
### Fixed
|
|
21
|
+
|
|
22
|
+
- **False "updated" counts on unchanged feed items.** ItemCreator now checks for significant attribute changes before saving. Items with no real changes return a new `:unchanged` status instead of `:updated`, eliminating unnecessary database writes and misleading dashboard statistics.
|
|
23
|
+
- **Redundant entry processing on unchanged feeds.** When a feed's body SHA-256 signature matches the previous fetch, entry processing is now skipped entirely (like the existing 304 Not Modified path), avoiding unnecessary parsing, DB lookups, and saves.
|
|
24
|
+
- **Adaptive interval not backing off for stable feeds.** The `content_changed` signal for adaptive fetch scheduling now uses an item-level content hash (sorted entry IDs) instead of the raw XML body hash. This prevents cosmetic feed changes (e.g., `<lastBuildDate>` updates) from defeating interval backoff, allowing stable feeds to correctly increase their fetch interval.
|
|
25
|
+
|
|
26
|
+
### Testing
|
|
27
|
+
|
|
28
|
+
- 1,031 tests, 3,300 assertions, 0 failures.
|
|
29
|
+
- RuboCop: 0 offenses.
|
|
30
|
+
- Brakeman: 0 warnings.
|
|
31
|
+
|
|
18
32
|
## [0.6.0] - 2026-02-17
|
|
19
33
|
|
|
20
34
|
### Added
|
data/Gemfile.lock
CHANGED
data/VERSION
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
0.
|
|
1
|
+
0.7.0
|
data/lib/source_monitor/fetching/feed_fetcher/entry_processor.rb
CHANGED
|
@@ -14,6 +14,7 @@ module SourceMonitor
|
|
|
14
14
|
return FeedFetcher::EntryProcessingResult.new(
|
|
15
15
|
created: 0,
|
|
16
16
|
updated: 0,
|
|
17
|
+
unchanged: 0,
|
|
17
18
|
failed: 0,
|
|
18
19
|
items: [],
|
|
19
20
|
errors: [],
|
|
@@ -23,6 +24,7 @@ module SourceMonitor
|
|
|
23
24
|
|
|
24
25
|
created = 0
|
|
25
26
|
updated = 0
|
|
27
|
+
unchanged = 0
|
|
26
28
|
failed = 0
|
|
27
29
|
items = []
|
|
28
30
|
created_items = []
|
|
@@ -39,6 +41,8 @@ module SourceMonitor
|
|
|
39
41
|
created_items << result.item
|
|
40
42
|
SourceMonitor::Events.after_item_created(item: result.item, source:, entry:, result: result)
|
|
41
43
|
enqueue_image_download(result.item)
|
|
44
|
+
elsif result.unchanged?
|
|
45
|
+
unchanged += 1
|
|
42
46
|
else
|
|
43
47
|
updated += 1
|
|
44
48
|
updated_items << result.item
|
|
@@ -52,6 +56,7 @@ module SourceMonitor
|
|
|
52
56
|
FeedFetcher::EntryProcessingResult.new(
|
|
53
57
|
created:,
|
|
54
58
|
updated:,
|
|
59
|
+
unchanged:,
|
|
55
60
|
failed:,
|
|
56
61
|
items:,
|
|
57
62
|
errors: errors.compact,
|
data/lib/source_monitor/fetching/feed_fetcher/source_updater.rb
CHANGED
|
@@ -11,7 +11,7 @@ module SourceMonitor
|
|
|
11
11
|
@adaptive_interval = adaptive_interval
|
|
12
12
|
end
|
|
13
13
|
|
|
14
|
-
def update_source_for_success(response, duration_ms, feed, feed_signature)
|
|
14
|
+
def update_source_for_success(response, duration_ms, feed, feed_signature, content_changed: nil, entries_digest: nil)
|
|
15
15
|
attributes = {
|
|
16
16
|
last_fetched_at: Time.current,
|
|
17
17
|
last_fetch_duration_ms: duration_ms,
|
|
@@ -31,8 +31,10 @@ module SourceMonitor
|
|
|
31
31
|
attributes[:last_modified] = parsed_time if parsed_time
|
|
32
32
|
end
|
|
33
33
|
|
|
34
|
-
|
|
35
|
-
|
|
34
|
+
# Use explicit content_changed if provided, otherwise fall back to feed signature comparison
|
|
35
|
+
changed = content_changed.nil? ? feed_signature_changed?(feed_signature) : content_changed
|
|
36
|
+
adaptive_interval.apply_adaptive_interval!(attributes, content_changed: changed)
|
|
37
|
+
attributes[:metadata] = updated_metadata(feed_signature: feed_signature, entries_digest: entries_digest)
|
|
36
38
|
reset_retry_state!(attributes)
|
|
37
39
|
source.update!(attributes)
|
|
38
40
|
end
|
|
@@ -111,10 +113,11 @@ module SourceMonitor
|
|
|
111
113
|
(source.metadata || {}).fetch("last_feed_signature", nil) != feed_signature
|
|
112
114
|
end
|
|
113
115
|
|
|
114
|
-
def updated_metadata(feed_signature: nil)
|
|
116
|
+
def updated_metadata(feed_signature: nil, entries_digest: nil)
|
|
115
117
|
metadata = (source.metadata || {}).dup
|
|
116
118
|
metadata.delete("dynamic_fetch_interval_seconds")
|
|
117
119
|
metadata["last_feed_signature"] = feed_signature if feed_signature.present?
|
|
120
|
+
metadata["last_entries_digest"] = entries_digest if entries_digest.present?
|
|
118
121
|
metadata
|
|
119
122
|
end
|
|
120
123
|
|
data/lib/source_monitor/fetching/feed_fetcher.rb
CHANGED
|
@@ -17,6 +17,7 @@ module SourceMonitor
|
|
|
17
17
|
EntryProcessingResult = Struct.new(
|
|
18
18
|
:created,
|
|
19
19
|
:updated,
|
|
20
|
+
:unchanged,
|
|
20
21
|
:failed,
|
|
21
22
|
:items,
|
|
22
23
|
:errors,
|
|
@@ -123,11 +124,28 @@ module SourceMonitor
|
|
|
123
124
|
def handle_success(response, started_at, instrumentation_payload)
|
|
124
125
|
duration_ms = source_updater.elapsed_ms(started_at)
|
|
125
126
|
body = response.body
|
|
127
|
+
feed_body_signature = body_digest(body)
|
|
126
128
|
feed = parse_feed(body, response)
|
|
127
|
-
processing = entry_processor.process_feed_entries(feed)
|
|
128
129
|
|
|
129
|
-
|
|
130
|
-
|
|
130
|
+
if source_updater.feed_signature_changed?(feed_body_signature)
|
|
131
|
+
processing = entry_processor.process_feed_entries(feed)
|
|
132
|
+
content_changed = entries_digest_changed?(feed)
|
|
133
|
+
else
|
|
134
|
+
processing = EntryProcessingResult.new(
|
|
135
|
+
created: 0,
|
|
136
|
+
updated: 0,
|
|
137
|
+
unchanged: 0,
|
|
138
|
+
failed: 0,
|
|
139
|
+
items: [],
|
|
140
|
+
errors: [],
|
|
141
|
+
created_items: [],
|
|
142
|
+
updated_items: []
|
|
143
|
+
)
|
|
144
|
+
content_changed = false
|
|
145
|
+
end
|
|
146
|
+
|
|
147
|
+
feed_entries_digest = entries_digest(feed)
|
|
148
|
+
source_updater.update_source_for_success(response, duration_ms, feed, feed_body_signature, content_changed: content_changed, entries_digest: feed_entries_digest)
|
|
131
149
|
source_updater.create_fetch_log(
|
|
132
150
|
response: response,
|
|
133
151
|
duration_ms: duration_ms,
|
|
@@ -180,6 +198,7 @@ module SourceMonitor
|
|
|
180
198
|
item_processing: EntryProcessingResult.new(
|
|
181
199
|
created: 0,
|
|
182
200
|
updated: 0,
|
|
201
|
+
unchanged: 0,
|
|
183
202
|
failed: 0,
|
|
184
203
|
items: [],
|
|
185
204
|
errors: [],
|
|
@@ -230,6 +249,7 @@ module SourceMonitor
|
|
|
230
249
|
item_processing: EntryProcessingResult.new(
|
|
231
250
|
created: 0,
|
|
232
251
|
updated: 0,
|
|
252
|
+
unchanged: 0,
|
|
233
253
|
failed: 0,
|
|
234
254
|
items: [],
|
|
235
255
|
errors: [],
|
|
@@ -277,6 +297,32 @@ module SourceMonitor
|
|
|
277
297
|
Digest::SHA256.hexdigest(body)
|
|
278
298
|
end
|
|
279
299
|
|
|
300
|
+
def entries_digest(feed)
|
|
301
|
+
return if feed.nil? || !feed.respond_to?(:entries)
|
|
302
|
+
|
|
303
|
+
ids = Array(feed.entries).map do |entry|
|
|
304
|
+
if entry.respond_to?(:entry_id) && entry.entry_id.present?
|
|
305
|
+
entry.entry_id
|
|
306
|
+
elsif entry.respond_to?(:url) && entry.url.present?
|
|
307
|
+
entry.url
|
|
308
|
+
elsif entry.respond_to?(:title) && entry.title.present?
|
|
309
|
+
entry.title
|
|
310
|
+
end
|
|
311
|
+
end.compact.sort
|
|
312
|
+
|
|
313
|
+
return if ids.empty?
|
|
314
|
+
|
|
315
|
+
Digest::SHA256.hexdigest(ids.join("\0"))
|
|
316
|
+
end
|
|
317
|
+
|
|
318
|
+
def entries_digest_changed?(feed)
|
|
319
|
+
digest = entries_digest(feed)
|
|
320
|
+
return false if digest.nil?
|
|
321
|
+
|
|
322
|
+
stored = (source.metadata || {}).fetch("last_entries_digest", nil)
|
|
323
|
+
stored != digest
|
|
324
|
+
end
|
|
325
|
+
|
|
280
326
|
def adaptive_interval
|
|
281
327
|
@adaptive_interval ||= AdaptiveInterval.new(source: source, jitter_proc: jitter_proc)
|
|
282
328
|
end
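Reviewer note: the `entries_digest` method added above hashes the sorted entry identifiers rather than the raw body, which is why reordering entries or a cosmetic body change leaves it stable. The standalone sketch below mirrors that logic so it can be run without Rails; `EntrySketch` and `entries_digest_sketch` are hypothetical names, and the blank check approximates ActiveSupport's `present?`.

```ruby
require "digest"

# Stand-in for a parsed feed entry.
EntrySketch = Struct.new(:entry_id, :url, :title, keyword_init: true)

# Same fallback order as the diff: entry_id, then url, then title.
def entries_digest_sketch(entries)
  ids = Array(entries).map do |entry|
    [entry.entry_id, entry.url, entry.title].find do |value|
      value && !value.to_s.strip.empty?
    end
  end.compact.sort

  return if ids.empty?

  # NUL separator keeps "ab" + "c" distinct from "a" + "bc".
  Digest::SHA256.hexdigest(ids.join("\0"))
end

a = EntrySketch.new(entry_id: "guid-1")
b = EntrySketch.new(url: "https://example.test/post-2")
c = EntrySketch.new(title: "Third post")

same_set  = entries_digest_sketch([a, b]) == entries_digest_sketch([b, a])
new_entry = entries_digest_sketch([a, b]) != entries_digest_sketch([a, b, c])
puts "order-insensitive: #{same_set}, changes on new entry: #{new_entry}"
```

One consequence of hashing identifiers only: an edit to an existing entry's content does not change this digest, so it affects only the adaptive-interval signal, while the body signature still decides whether entries are processed.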
|
data/lib/source_monitor/items/item_creator.rb
CHANGED
|
@@ -21,6 +21,10 @@ module SourceMonitor
|
|
|
21
21
|
def updated?
|
|
22
22
|
status == :updated
|
|
23
23
|
end
|
|
24
|
+
|
|
25
|
+
def unchanged?
|
|
26
|
+
status == :unchanged
|
|
27
|
+
end
|
|
24
28
|
end
|
|
25
29
|
|
|
26
30
|
FINGERPRINT_SEPARATOR = "\u0000".freeze
|
|
@@ -46,8 +50,15 @@ module SourceMonitor
|
|
|
46
50
|
existing_item, matched_by = existing_item_for(attributes, raw_guid_present: raw_guid.present?)
|
|
47
51
|
|
|
48
52
|
if existing_item
|
|
49
|
-
|
|
50
|
-
|
|
53
|
+
apply_attributes(existing_item, attributes)
|
|
54
|
+
instrument_duplicate(existing_item, matched_by)
|
|
55
|
+
if significant_changes?(existing_item)
|
|
56
|
+
existing_item.save!
|
|
57
|
+
return Result.new(item: existing_item, status: :updated, matched_by: matched_by)
|
|
58
|
+
else
|
|
59
|
+
existing_item.reload if existing_item.changed?
|
|
60
|
+
return Result.new(item: existing_item, status: :unchanged, matched_by: matched_by)
|
|
61
|
+
end
|
|
51
62
|
end
|
|
52
63
|
|
|
53
64
|
create_new_item(attributes, raw_guid_present: raw_guid.present?)
|
|
@@ -100,7 +111,7 @@ module SourceMonitor
|
|
|
100
111
|
|
|
101
112
|
def update_existing_item(existing_item, attributes, matched_by)
|
|
102
113
|
apply_attributes(existing_item, attributes)
|
|
103
|
-
existing_item.save!
|
|
114
|
+
existing_item.save! if significant_changes?(existing_item)
|
|
104
115
|
instrument_duplicate(existing_item, matched_by)
|
|
105
116
|
existing_item
|
|
106
117
|
end
|
|
@@ -117,8 +128,15 @@ module SourceMonitor
|
|
|
117
128
|
def handle_concurrent_duplicate(attributes, raw_guid_present:)
|
|
118
129
|
matched_by = raw_guid_present ? :guid : :fingerprint
|
|
119
130
|
existing = find_conflicting_item(attributes, matched_by)
|
|
120
|
-
|
|
121
|
-
|
|
131
|
+
apply_attributes(existing, attributes)
|
|
132
|
+
instrument_duplicate(existing, matched_by)
|
|
133
|
+
if significant_changes?(existing)
|
|
134
|
+
existing.save!
|
|
135
|
+
Result.new(item: existing, status: :updated, matched_by: matched_by)
|
|
136
|
+
else
|
|
137
|
+
existing.reload if existing.changed?
|
|
138
|
+
Result.new(item: existing, status: :unchanged, matched_by: matched_by)
|
|
139
|
+
end
|
|
122
140
|
end
|
|
123
141
|
|
|
124
142
|
def find_conflicting_item(attributes, matched_by)
|
|
@@ -131,6 +149,10 @@ module SourceMonitor
|
|
|
131
149
|
end
|
|
132
150
|
end
|
|
133
151
|
|
|
152
|
+
# Attributes that should not trigger an "updated" status when they change.
|
|
153
|
+
# Metadata contains feedjira object references that differ between parses.
|
|
154
|
+
IGNORED_CHANGE_ATTRIBUTES = %w[metadata].freeze
|
|
155
|
+
|
|
134
156
|
def apply_attributes(record, attributes)
|
|
135
157
|
attributes = attributes.dup
|
|
136
158
|
metadata = attributes.delete(:metadata)
|
|
@@ -138,6 +160,10 @@ module SourceMonitor
|
|
|
138
160
|
record.metadata = metadata if metadata
|
|
139
161
|
end
|
|
140
162
|
|
|
163
|
+
def significant_changes?(record)
|
|
164
|
+
(record.changed - IGNORED_CHANGE_ATTRIBUTES).any?
|
|
165
|
+
end
|
|
166
|
+
|
|
141
167
|
def build_attributes
|
|
142
168
|
entry_parser.parse
|
|
143
169
|
end
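Reviewer note: `significant_changes?` above relies on ActiveRecord's dirty tracking, where `record.changed` returns the names of modified attributes. The sketch below shows the same set subtraction with a plain Struct in place of a model; `RecordSketch`, `IGNORED_CHANGE_ATTRIBUTES_SKETCH`, and `significant_changes_sketch?` are hypothetical names used only for illustration.

```ruby
IGNORED_CHANGE_ATTRIBUTES_SKETCH = %w[metadata].freeze

# Stand-in for a model: `changed` lists the names of dirty attributes.
RecordSketch = Struct.new(:changed)

# True only when something other than an ignored attribute is dirty.
def significant_changes_sketch?(record)
  (record.changed - IGNORED_CHANGE_ATTRIBUTES_SKETCH).any?
end

metadata_only = RecordSketch.new(%w[metadata])
real_edit     = RecordSketch.new(%w[metadata title])

puts significant_changes_sketch?(metadata_only) # metadata-only diffs are ignored
puts significant_changes_sketch?(real_edit)     # a title change still counts
```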
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: source_monitor
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.
|
|
4
|
+
version: 0.7.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- dchuk
|
|
@@ -343,7 +343,9 @@ files:
|
|
|
343
343
|
- ".rubocop.yml"
|
|
344
344
|
- ".ruby-version"
|
|
345
345
|
- ".vbw-planning/PROJECT.md"
|
|
346
|
+
- ".vbw-planning/ROADMAP.md"
|
|
346
347
|
- ".vbw-planning/SHIPPED.md"
|
|
348
|
+
- ".vbw-planning/STATE.md"
|
|
347
349
|
- ".vbw-planning/codebase/ARCHITECTURE.md"
|
|
348
350
|
- ".vbw-planning/codebase/CONCERNS.md"
|
|
349
351
|
- ".vbw-planning/codebase/CONVENTIONS.md"
|
|
@@ -425,6 +427,13 @@ files:
|
|
|
425
427
|
- ".vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/03-VERIFICATION.md"
|
|
426
428
|
- ".vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/PLAN-01-SUMMARY.md"
|
|
427
429
|
- ".vbw-planning/milestones/upgrade-assurance/phases/03-upgrade-skill-docs/PLAN-01.md"
|
|
430
|
+
- ".vbw-planning/phases/01-aia-certificate-resolution/.context-dev.md"
|
|
431
|
+
- ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-01-SUMMARY.md"
|
|
432
|
+
- ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-01.md"
|
|
433
|
+
- ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-02-SUMMARY.md"
|
|
434
|
+
- ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-02.md"
|
|
435
|
+
- ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-03-SUMMARY.md"
|
|
436
|
+
- ".vbw-planning/phases/01-aia-certificate-resolution/PLAN-03.md"
|
|
428
437
|
- AGENTS.md
|
|
429
438
|
- CHANGELOG.md
|
|
430
439
|
- CLAUDE.md
|