quonfig 1.0.0 → 1.1.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +7 -0
- data/lib/quonfig/client.rb +67 -0
- data/lib/quonfig/config_loader.rb +291 -43
- data/lib/quonfig/http_connection.rb +18 -1
- data/lib/quonfig/options.rb +45 -1
- data/lib/quonfig/sse_config_client.rb +14 -0
- data/lib/quonfig/version.rb +1 -1
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d12e38af54000d7cfeb34f7c3f50739527e63448668abf9d6fd4b436b27b277f
+  data.tar.gz: 1421bc6f53246ee1260a5c832ef342faa5ca70b6854acba6f12d9b51a464e99d
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 73d25bb37c6c30e1d1b756172e0ffd4885bf901c9b600eec3c060f06bead7b45cf74e09544908d925a4345fcb5f9b6efe6d8331d7f8544079ffabf7b9d5b1228
+  data.tar.gz: 04df7b6f24efb0f9a3c30c47e32c686f2407ac54c1776eb7a2313fe1332f8339eac4693fc49edb49d76e00b7521680701bedfbfae5839b4d9158b3ace473efbc
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,12 @@
 # Changelog
 
+## 1.1.0 - 2026-07-01
+
+- **Feat (delivery): the HTTP config-fetch is now a parallel-failover hedge (qfg-7h5d.1.14).** On every init/refresh config fetch the SDK fires the **primary** `api_urls` leg first; if it answers within `config_fetch_hedge_delay_ms` it **wins and the secondary is never contacted** (cold standby — a healthy system adds zero secondary load). If the primary is slow past the hedge delay **or** errors fast, the SDK **also** fires the secondary **in parallel** without cancelling the primary. Whatever arrives is installed through the existing reject-older guard, so watermark-max falls out for free: the higher generation wins, a late older payload never regresses an established client, and a late newer payload heals forward. Readiness latches on the first successful install; a late-but-newer leg heals forward afterward. SSE is untouched — only the HTTP config-fetch path hedges. Both legs failing preserves the existing init-failure semantics (`on_init_failure`).
+- **Feat (options): two additive hedge knobs.** `config_fetch_hedge_delay_ms` (default `2000`) is how long the hedge waits for the primary before also firing the secondary in parallel. `config_fetch_hedge_abort_ms` (default `6000`) is the per-leg hard-abort deadline on the hedged path; it must exceed the longest healable primary latency so a late-but-newer primary heals forward, and must be below `init_timeout_ms` so the init-path heal leg is not clipped (the client logs a one-time warning at construction if `init_timeout_ms <= config_fetch_hedge_abort_ms` with a secondary leg configured). The existing `config_fetch_timeout_ms` is unchanged and still governs the sequential / single-URL fetch path.
+- **Backward-compatible behavioral notes (additive minor).** (1) In a fast-both topology where both legs answer well inside the hedge delay, `resolved_from` may now report `"primary"` where 1.0.0's sequential fetch could report `"secondary"` — a fast healthy primary now always wins. (2) On a heal-forward (a late newer leg landing after readiness latched), an **extra** post-ready `on_update` config-update callback may fire as the client converges on the higher generation. (3) ETags are now tracked **per leg** rather than as a single shared value, so a 304 from one leg can no longer mask the other and the two concurrent legs no longer race on a shared ETag.
+- **Install-guard carve-out for unversioned snapshots:** a delivery payload whose `generation` is absent or `<= 0` (e.g. from a server that predates the generation watermark) is installed by an established client rather than rejected as older. Defensive back-compat guard — with servers that emit true generations it never triggers.
+
 ## 1.0.0 - 2026-06-06
 
 - **Stable 1.0.0 release.** The Quonfig Ruby SDK is now declared stable. No API or
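The 1.1.0 changelog entry above describes a decision rule for which legs a single fetch contacts. As a rough illustration only, that rule can be modeled as a pure function. `legs_contacted` and its keyword arguments are hypothetical names, not part of the gem, and the sketch deliberately ignores threading, ETags, and the per-leg abort deadline.

```ruby
# Hypothetical model of the hedge decision described in the changelog.
# Not the gem's implementation: it only answers "which legs get contacted?".
DEFAULT_HEDGE_DELAY_MS = 2_000 # default quoted in the changelog entry

def legs_contacted(primary_latency_ms:, primary_ok:, hedge_delay_ms: DEFAULT_HEDGE_DELAY_MS)
  answered_in_time = primary_latency_ms <= hedge_delay_ms

  # A primary that answers successfully inside the hedge delay wins outright,
  # so the secondary stays a cold standby.
  return [:primary] if answered_in_time && primary_ok

  # Slow past the hedge delay, or a fast error: the secondary is fired as well,
  # and the primary is not cancelled.
  %i[primary secondary]
end

p legs_contacted(primary_latency_ms: 150, primary_ok: true)
p legs_contacted(primary_latency_ms: 3_500, primary_ok: true)
p legs_contacted(primary_latency_ms: 40, primary_ok: false)
```

Treating "within the hedge delay" as less-than-or-equal is an assumption of this sketch; the changelog does not say how the exact boundary is handled.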
data/lib/quonfig/client.rb
CHANGED
@@ -409,6 +409,48 @@ module Quonfig
       end
     end
 
+    # ---- Failover + canonical-ordering diagnostics (qfg-7h5d.1.9) ------
+    #
+    # Read-only signals surfaced for the failover/ordering chaos probe and for
+    # operators. Like #connection_state / #last_successful_refresh these are
+    # DIAGNOSTIC ONLY — do not wire them into a liveness probe.
+
+    # True once the SDK has installed at least one envelope (any source). The
+    # failover scenarios assert the client reaches readiness off the secondary
+    # leg inside the init budget even when the primary is refused/hung/slow.
+    def ready?
+      !last_successful_refresh.nil?
+    end
+
+    # Meta.generation of the currently-held envelope (0 before the first install
+    # or when the backend does not emit a generation). Canonical ordering: an
+    # established client never regresses to a lower generation.
+    def held_generation
+      @config_loader&.held_generation || 0
+    end
+
+    # Count of envelopes actually installed. Rejected-older and same-generation
+    # snapshots do NOT bump this, so o04 can assert "no flap" via a stable count.
+    def config_install_count
+      @config_loader&.install_count || 0
+    end
+
+    # 'primary' / 'secondary' / '' — which config_api_urls leg produced the
+    # currently-held config. Used to assert HTTP config-fetch failover (f01-f04).
+    def resolved_from
+      @config_loader&.resolved_from || ''
+    end
+
+    # True if the live SSE stream has ever repointed to a non-primary leg. The
+    # failover epic asserts this stays false (f05): SSE does not fail over.
+    def sse_failed_over_to_secondary?
+      sse = @sse_client
+      return false if sse.nil?
+      return false unless sse.respond_to?(:failed_over_to_secondary?)
+
+      sse.failed_over_to_secondary?
+    end
+
     def fork
       self.class.new(@options.for_fork)
     end
@@ -770,6 +812,7 @@ module Quonfig
       raise Quonfig::Errors::InvalidSdkKeyError, @options.sdk_key if @options.sdk_key.nil? || @options.sdk_key.to_s.strip.empty?
 
       warn_if_pin_ignored_in_delivery_mode
+      warn_if_hedge_abort_exceeds_init_timeout
 
       @config_loader = Quonfig::ConfigLoader.new(@store, @options)
 
@@ -847,6 +890,30 @@ module Quonfig
       )
     end
 
+    # qfg-7h5d.1.14: the per-leg hedge abort MUST be < init_timeout_ms, otherwise
+    # the init-path heal leg is clipped by the overall init deadline before it can
+    # heal forward. Mirrors sdk-go's construction-time warning in options.go. Warn
+    # once at init in delivery mode; does not change behavior.
+    def warn_if_hedge_abort_exceeds_init_timeout
+      return unless @options.respond_to?(:config_fetch_hedge_abort_ms)
+      # The hedge (and its heal leg) only engages with a secondary leg; with a
+      # single config_api_url there is no heal leg to clip, so the warning would
+      # be misleading.
+      return unless Array(@options.config_api_urls).length >= 2
+
+      abort_ms = @options.config_fetch_hedge_abort_ms
+      init_ms = @options.init_timeout_ms
+      return if abort_ms.nil? || init_ms.nil?
+      return if init_ms > abort_ms
+
+      LOG.warn(
+        "[quonfig] init_timeout_ms (#{init_ms}ms) <= config_fetch_hedge_abort_ms " \
+        "(#{abort_ms}ms); the hedged config-fetch heal leg may be clipped by the " \
+        'init deadline before it can heal forward. Set init_timeout_ms above the ' \
+        'hedge abort.'
+      )
+    end
+
     def handle_init_failure(err)
       if @options.on_init_failure == Quonfig::Options::ON_INITIALIZATION_FAILURE::RETURN
         LOG.warn "[quonfig] Initialization did not complete cleanly; continuing with empty store: #{err.message}"
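The construction-time warning added in `client.rb` reduces to a small predicate over three inputs. The sketch below restates that condition as a stand-alone function so the cases are easy to see. `hedge_abort_clips_init?` is a hypothetical name and the example URLs are placeholders; the real method reads these values from `@options` and logs rather than returning a boolean.

```ruby
# Hypothetical stand-alone restatement of the warning condition.
# Returns true when the client would log the "heal leg may be clipped" warning.
def hedge_abort_clips_init?(init_timeout_ms:, hedge_abort_ms:, config_api_urls:)
  # With fewer than two legs there is no hedge and no heal leg to clip.
  return false unless Array(config_api_urls).length >= 2
  return false if init_timeout_ms.nil? || hedge_abort_ms.nil?

  # The abort must be strictly below the init budget, so equality also warns.
  init_timeout_ms <= hedge_abort_ms
end

two_legs = ['https://primary.example.test', 'https://secondary.example.test']

p hedge_abort_clips_init?(init_timeout_ms: 10_000, hedge_abort_ms: 6_000, config_api_urls: two_legs)
p hedge_abort_clips_init?(init_timeout_ms: 5_000, hedge_abort_ms: 6_000, config_api_urls: two_legs)
```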
data/lib/quonfig/config_loader.rb
CHANGED
@@ -18,7 +18,18 @@ module Quonfig
 
     CONFIGS_PATH = '/api/v2/configs'
 
-    attr_reader :
+    attr_reader :version, :environment_id
+
+    # qfg-7h5d.1.9 (canonical ordering). Diagnostic surface read by the failover/
+    # ordering chaos probe and by operators:
+    # held_generation — Meta.generation of the currently-installed envelope
+    # (nil before the first install).
+    # install_count — number of envelopes actually installed (rejected-older
+    # and same-generation snapshots do NOT bump this).
+    # resolved_from — 'primary' / 'secondary' / '' — which config_api_urls leg
+    # produced the currently-held config (HTTP installs only;
+    # SSE does not change it).
+    attr_reader :held_generation, :install_count
 
     # +store+: the Quonfig::ConfigStore to populate on successful fetch.
     # +options+: a Quonfig::Options instance (supplies sdk_key + config_api_urls).
@@ -37,32 +48,94 @@ module Quonfig
       end
 
       @api_config = Concurrent::Map.new
-
+      # qfg-7h5d.1.14: per-leg ETag is load-bearing for the parallel hedge. The
+      # hedge runs the primary and secondary legs concurrently; a SINGLE shared
+      # ETag would (a) let a 304 from one leg mask the other and (b) be a data
+      # race with two legs writing it. Each leg keeps its own slot keyed by
+      # config_api_urls index, guarded by @etag_mutex (snapshot before the
+      # request, write-back after — the network wait happens with no lock held).
+      @etags = {}
+      @etag_mutex = Mutex.new
       @version = nil
       @environment_id = nil
       @logger = logger || LOG
+
+      # Canonical-ordering state (qfg-7h5d.1.9). @install_mutex makes the
+      # guard-check-and-install atomic across every install path (initial fetch,
+      # failover/poll fetch, SSE snapshot, SSE update, fallback poller) — these
+      # run on different threads and must never interleave a stale install with a
+      # fresh one.
+      @held_generation = nil
+      @install_count = 0
+      @resolved_from_index = nil
+      @install_mutex = Mutex.new
+    end
+
+    # Backward-compatible reader: the primary leg's last ETag. Pre-hedge this was
+    # a single shared @etag; per-leg isolation now means index 0 is the canonical
+    # "the ETag" for callers/tests that read one value.
+    def etag
+      @etag_mutex.synchronize { @etags[0] }
     end
 
-    #
-    #
-
+    # 'primary' / 'secondary' / '' for the leg that produced the currently-held
+    # config (config_api_urls index 0 = primary, 1 = secondary).
+    def resolved_from
+      case @resolved_from_index
+      when nil then ''
+      when 0 then 'primary'
+      when 1 then 'secondary'
+      else "url#{@resolved_from_index}"
+      end
+    end
+
+    # Fetch configs from /api/v2/configs with per-leg ETag / If-None-Match caching.
+    #
+    # qfg-7h5d.1.14 — PARALLEL-FAILOVER HEDGE. On every init/refresh fetch the
+    # PRIMARY leg (config_api_urls[0]) is fired first, on the CALLING thread. If
+    # it answers within config_fetch_hedge_delay_ms it WINS and the secondary is
+    # NEVER contacted (cold standby — zero extra load on a healthy system). If the
+    # primary is SLOW past the hedge delay OR errors fast, the SECONDARY leg
+    # (config_api_urls[1]) is ALSO fired IN PARALLEL on a background thread,
+    # at-most-once — the primary is NOT cancelled. Whatever arrives is installed
+    # through the EXISTING reject-older guard (#install_envelope), so watermark-MAX
+    # falls out for free: a higher generation wins, a late OLDER payload never
+    # regresses an established client, and a late NEWER payload heals forward.
+    #
+    # fetch! returns as soon as the FIRST leg installs (readiness latches off it);
+    # any still-running leg keeps running on its own thread, bounded by
+    # config_fetch_hedge_abort_ms, and heals forward if it lands a newer
+    # generation. There is NO coalescing/in-flight gate — overlapping fetches are
+    # safe (per-leg ETag isolation + every install serialized through
+    # @install_mutex + the reject-older guard + each leg bounded by the abort), and
+    # a coalescing gate would make a manual refresh silently no-op (a contract
+    # violation).
     #
     # Returns one of:
-    # :updated --
-    # :not_modified -- 304
-    # :failed -- every
+    # :updated -- at least one leg installed a 200 envelope
+    # :not_modified -- a leg answered 304 (no change) and nothing installed
+    # :failed -- every fired leg failed
     def fetch!
-      Array(@options.config_api_urls)
-
-
-
-
+      urls = Array(@options.config_api_urls)
+      return :failed if urls.empty?
+
+      # Single leg (or no secondary configured): no hedge, just fetch on the
+      # calling thread under the SEQUENTIAL per-URL timeout (config_fetch_timeout_ms
+      # is unchanged and still governs any non-hedged path). Preserves the
+      # synchronous, single-request-per-call shape the legacy/mock callers depend
+      # on.
+      return fetch_from(urls[0], 0, timeout_ms: config_fetch_timeout_ms) if urls.length < 2
+
+      fetch_hedged(urls)
     end
 
     # Apply a ConfigEnvelope (from SSE) to the store. Called by the SSE client
-    # when a new event arrives.
+    # when a new event arrives. SSE is a single-leg live stream, so it carries no
+    # config_api_urls index and does not change #resolved_from — but it IS
+    # guarded by the same reject-older rule (a late SSE snapshot must not regress
+    # an established client).
     def apply_envelope(envelope)
-      install_envelope(envelope, source: :sse)
+      install_envelope(envelope, source: :sse, source_index: nil)
     end
 
     def calc_config
@@ -83,19 +156,149 @@ module Quonfig
 
     private
 
-
-
+    # Hedge orchestration (qfg-7h5d.1.14). Fires the primary leg on its own
+    # thread; if it is slow past the hedge delay OR errors fast, ALSO fires the
+    # secondary in parallel — at-most-once, never after a fast primary win, never
+    # cancelling the primary. Both legs push their settled result to a shared
+    # queue. fetch! returns as soon as the FIRST leg INSTALLS (so readiness
+    # latches off it); the other leg keeps running on its own thread, bounded by
+    # the hedge abort inside fetch_from, and heals forward through the
+    # reject-older guard. We never join the slow leg — a hung primary must not
+    # block a successful secondary install.
+    def fetch_hedged(urls)
+      hedge_delay_s = hedge_delay_ms / 1000.0
+      abort_ms = hedge_abort_ms
+
+      # Each fired leg pushes exactly one [:done, index, result] message. A
+      # SizedQueue large enough for both legs so a finished leg never blocks on
+      # push after we've stopped draining.
+      results = Queue.new
+
+      # At-most-once secondary gate. The mutex makes "a fast primary win
+      # suppresses the secondary" and "the hedge-delay elapsing fires it"
+      # mutually exclusive — exactly one of suppress/fire wins.
+      gate = Mutex.new
+      secondary_fired = false
+
+      run_leg = lambda do |index|
+        Thread.new do
+          result = begin
+            fetch_from(urls[index], index, timeout_ms: abort_ms)
+          rescue StandardError => e
+            @logger.debug "Hedge leg #{index} failed: #{e.class}: #{e.message}"
+            :failed
+          end
+          results.push([:done, index, result])
+        end
+      end
+
+      fire_secondary = lambda do
+        spawn = gate.synchronize do
+          next false if secondary_fired
+
+          secondary_fired = true
+        end
+        run_leg.call(1) if spawn
+      end
+
+      suppress_secondary = lambda do
+        gate.synchronize { secondary_fired = true }
+      end
+
+      # The primary always runs. A separate hedge-delay timer thread fires the
+      # secondary if the primary has not settled by then — without waiting for
+      # the primary to finish.
+      primary_thread = run_leg.call(0)
+
+      hedge_timer = Thread.new do
+        sleep hedge_delay_s
+        # If the primary is still in flight at the hedge delay, hedge in parallel.
+        fire_secondary.call if primary_thread.alive?
+      end
+
+      installed = false
+      saw_not_modified = false
+      drained = 0
+
+      # Drain leg results until the FIRST install latches readiness, or until
+      # every fired leg has reported (so a both-fail / both-304 cycle still
+      # terminates). `fired` is read under the gate because the secondary can be
+      # spawned concurrently by the timer or the primary's fast-error path.
+      loop do
+        fired = gate.synchronize { secondary_fired ? 2 : 1 }
+        break if drained >= fired && results.empty?
+
+        _tag, index, result = results.pop
+        drained += 1
+
+        case result
+        when :failed
+          # A fast primary error must hedge immediately (do not wait for the
+          # timer). The gate keeps the secondary at-most-once.
+          fire_secondary.call if index.zero?
+        when :not_modified
+          saw_not_modified = true
+        else # :updated -> a real install
+          installed = true
+          # If the PRIMARY just won inside the hedge window, close the gate so a
+          # racing timer can never fire the secondary — the cold-standby promise.
+          suppress_secondary.call if index.zero?
+          break
+        end
+      end
+
+      # Stop the timer if it is still sleeping (already-fired is harmless).
+      hedge_timer.kill if hedge_timer.alive?
+
+      return :updated if installed
+      return :not_modified if saw_not_modified
+
+      :failed
+    end
+
+    def hedge_delay_ms
+      if @options.respond_to?(:config_fetch_hedge_delay_ms) && @options.config_fetch_hedge_delay_ms
+        @options.config_fetch_hedge_delay_ms
+      else
+        Quonfig::Options::DEFAULT_CONFIG_FETCH_HEDGE_DELAY_MS
+      end
+    end
+
+    def hedge_abort_ms
+      if @options.respond_to?(:config_fetch_hedge_abort_ms) && @options.config_fetch_hedge_abort_ms
+        @options.config_fetch_hedge_abort_ms
+      else
+        Quonfig::Options::DEFAULT_CONFIG_FETCH_HEDGE_ABORT_MS
+      end
+    end
+
+    def config_fetch_timeout_ms
+      @options.respond_to?(:config_fetch_timeout_ms) ? @options.config_fetch_timeout_ms : nil
+    end
+
+    def fetch_from(source, index = nil, timeout_ms: nil)
+      # qfg-7h5d.1.9 / .1.14: bound this single per-leg attempt so a hung upstream
+      # aborts (Faraday::TimeoutError, caught below as :failed). On the hedged
+      # path the caller passes config_fetch_hedge_abort_ms; on the sequential /
+      # single-URL path it passes config_fetch_timeout_ms.
+      conn = Quonfig::HttpConnection.new(source, @options.sdk_key, timeout_ms: timeout_ms)
       headers = {}
-
+      # Per-leg ETag: snapshot this leg's slot before the request (no lock held
+      # during the network wait).
+      etag = etag_for(index)
+      headers['If-None-Match'] = etag if etag
       response = conn.get(CONFIGS_PATH, headers)
 
       case response.status
       when 200
         new_etag = response.headers['ETag'] || response.headers['etag']
         envelope = parse_envelope(response.body)
-        install_envelope(envelope, source: source)
-
-
+        result = install_envelope(envelope, source: source, source_index: index)
+        # Write this leg's ETag back AFTER the response (per-leg, race-free).
+        set_etag_for(index, new_etag)
+        # install_envelope returns :not_modified when the reject-older guard drops
+        # an equal/older payload — surface that so the caller doesn't double-count.
+        result == :not_modified ? :not_modified : :updated
       when 304
         @logger.debug "Configs not modified (304) from #{source}"
         :not_modified
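The orchestration in `fetch_hedged` is easier to follow with the HTTP details removed. The following is a minimal sketch of the same pattern, assuming each leg is a plain callable that returns `:updated`, `:not_modified`, or `:failed` (standing in for `fetch_from`). The name `hedged_fetch`, its arguments, and the returned `[status, winning_index]` pair are illustrative choices rather than the gem's API, and the per-leg abort deadline is omitted.

```ruby
# Simplified hedge: primary first, secondary at most once, first install wins.
# Hypothetical helper for illustration; legs are callables, not HTTP requests.
def hedged_fetch(primary, secondary, hedge_delay_s:)
  results = Queue.new
  gate = Mutex.new
  secondary_fired = false

  # Run one leg on its own thread and report exactly one [index, outcome].
  run_leg = lambda do |index, leg|
    Thread.new do
      outcome = begin
        leg.call
      rescue StandardError
        :failed
      end
      results.push([index, outcome])
    end
  end

  # At-most-once gate: whoever flips the flag first decides.
  fire_secondary = lambda do
    spawn = gate.synchronize do
      next false if secondary_fired

      secondary_fired = true
    end
    run_leg.call(1, secondary) if spawn
  end

  primary_thread = run_leg.call(0, primary)
  timer = Thread.new do
    sleep hedge_delay_s
    fire_secondary.call if primary_thread.alive?
  end

  status = :failed
  winner = nil
  drained = 0
  loop do
    fired = gate.synchronize { secondary_fired ? 2 : 1 }
    break if drained >= fired && results.empty?

    index, outcome = results.pop
    drained += 1
    case outcome
    when :failed
      fire_secondary.call if index.zero? # fast primary error hedges immediately
    when :not_modified
      status = :not_modified
    else
      # A primary win closes the gate so a racing timer cannot fire the secondary.
      gate.synchronize { secondary_fired = true } if index.zero?
      status = :updated
      winner = index
      break
    end
  end

  timer.kill if timer.alive?
  [status, winner]
end

slow_primary = -> { sleep 0.3; :updated }
fast_secondary = -> { :updated }
p hedged_fetch(slow_primary, fast_secondary, hedge_delay_s: 0.05)
```

As in the diff, the slow leg is never joined: it keeps running on its own thread after the function returns, which is why the real code relies on the reject-older guard to make a late result safe.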
@@ -114,6 +317,14 @@ module Quonfig
       :failed
     end
 
+    def etag_for(index)
+      @etag_mutex.synchronize { @etags[index || 0] }
+    end
+
+    def set_etag_for(index, value)
+      @etag_mutex.synchronize { @etags[index || 0] = value }
+    end
+
     def parse_envelope(body)
       data = body.is_a?(String) ? JSON.parse(body) : body
       Quonfig::ConfigEnvelope.new(
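The per-leg ETag helpers above can be pictured as a small thread-safe store keyed by leg index. The sketch below is one way to express that idea in isolation. `PerLegEtags` and `request_headers` are hypothetical names introduced here; only the `index || 0` fallback and the `If-None-Match` header come from the diff.

```ruby
# Hypothetical stand-alone version of the per-leg ETag bookkeeping.
class PerLegEtags
  def initialize
    @etags = {}
    @mutex = Mutex.new
  end

  # A nil index falls back to slot 0 (the primary), as in the diff.
  def get(index)
    @mutex.synchronize { @etags[index || 0] }
  end

  def set(index, value)
    @mutex.synchronize { @etags[index || 0] = value }
  end

  # Conditional-request headers for one leg. The lock is only held while
  # reading the slot, never across a network call.
  def request_headers(index)
    etag = get(index)
    etag ? { 'If-None-Match' => etag } : {}
  end
end

etags = PerLegEtags.new
etags.set(0, '"abc123"')
p etags.request_headers(0)
p etags.request_headers(1)
```

Because each leg only ever sees its own slot, a 304 earned by the primary's ETag cannot suppress a full response from the secondary.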
@@ -129,37 +340,74 @@ module Quonfig
       str.length > 200 ? "#{str[0, 200]}..." : str
     end
 
-    def install_envelope(envelope, source:)
-
-
-
-
-
+    def install_envelope(envelope, source:, source_index: nil)
+      meta = envelope.meta || {}
+      incoming_gen = extract_generation(meta)
+
+      @install_mutex.synchronize do
+        # Reject-older install guard (canonical ordering, §5f). A fresh client
+        # (no held generation) seeds off whatever arrives first — even an older
+        # or gen-0 snapshot. An established client installs ONLY when the incoming
+        # generation strictly advances the held one: a same-generation snapshot is
+        # a no-op (no store churn, no install-count bump, no resolved-from change)
+        # so a duplicate leg never flaps an established client, and an OLDER
+        # snapshot (a stale secondary reached on failover) is dropped so the client
+        # never regresses. Reject-older is the whole rule — no source ranking; a
+        # newer primary landing late heals forward automatically. Applies on every
+        # network install path (initial fetch, failover/poll fetch, SSE snapshot,
+        # SSE update, fallback poller); datadir install bypasses this (it is the
+        # local source of truth and goes through Client#apply_datadir_envelope).
+        # Carve-out: an UNVERSIONED snapshot (generation <= 0 — a server that
+        # predates the watermark, or one whose rev-count failed) carries no
+        # ordering info, so it is never rejected as "older"; freezing the client
+        # on stale config would be worse (mirrors sdk-node).
+        unless @held_generation.nil? || incoming_gen <= 0 || incoming_gen > @held_generation
+          @logger.debug "Reject-older guard: dropping incoming generation #{incoming_gen} <= held #{@held_generation} (source=#{source})"
+          return :not_modified
+        end
 
-
-
-
+        # Update internal tracking map (for legacy callers / introspection).
+        next_map = Concurrent::Map.new
+        envelope.configs.each do |cfg|
+          key = config_key(cfg)
+          next if key.nil?
 
-
-
-
+          next_map[key] = { source: source, config: cfg }
+        end
+        @api_config = next_map
+
+        @version = meta['version'] || meta[:version] || @version
+        @environment_id = meta['environment'] || meta[:environment] || @environment_id
 
-
-
+        @held_generation = incoming_gen
+        @install_count += 1
+        @resolved_from_index = source_index unless source_index.nil?
 
-
-
-      # Drop keys that disappeared server-side.
-      (old_keys - new_keys).each { |k| @store.delete(k) } if @store.respond_to?(:delete)
+        # Replace the live store atomically.
+        return if @store.nil?
 
-
-
-
+        new_keys = next_map.keys.to_set
+        old_keys = @store.keys.to_set
+        # Drop keys that disappeared server-side.
+        (old_keys - new_keys).each { |k| @store.delete(k) } if @store.respond_to?(:delete)
 
-
+        envelope.configs.each do |cfg|
+          key = config_key(cfg)
+          next if key.nil?
+
+          @store.set(key, cfg)
+        end
       end
     end
 
+    # Read Meta.generation (qfg-7h5d.1.1) — the monotonic per-branch commit
+    # counter the backend stamps on every envelope. Absent/garbage → 0 (an old
+    # backend that doesn't emit it, or fixture mode with no FIXTURE_GENERATION).
+    def extract_generation(meta)
+      g = meta['generation'] || meta[:generation]
+      g.is_a?(Numeric) ? g.to_i : 0
+    end
+
     def config_key(cfg)
       return cfg['key'] || cfg[:key] if cfg.is_a?(Hash)
 
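The reject-older guard in `install_envelope` comes down to a single comparison plus two exemptions (a fresh client, and an unversioned payload). The sketch below isolates just that rule so its cases can be exercised directly. `GenerationGuard` and `offer` are hypothetical names; store updates, `resolved_from`, and logging are left out.

```ruby
# Hypothetical isolation of the reject-older rule from install_envelope.
class GenerationGuard
  attr_reader :held_generation, :install_count

  def initialize
    @held_generation = nil # nil means "nothing installed yet"
    @install_count = 0
    @mutex = Mutex.new
  end

  # Returns :updated when the payload would be installed,
  # :not_modified when the guard drops it.
  def offer(generation)
    # Absent or non-numeric generations are treated as 0 (unversioned).
    incoming = generation.is_a?(Numeric) ? generation.to_i : 0

    @mutex.synchronize do
      established = !@held_generation.nil?
      # Drop only when established, versioned, and not strictly newer.
      next :not_modified if established && incoming.positive? && incoming <= @held_generation

      @held_generation = incoming
      @install_count += 1
      :updated
    end
  end
end

guard = GenerationGuard.new
p guard.offer(5)   # fresh client seeds off the first payload
p guard.offer(5)   # same generation
p guard.offer(3)   # older payload
p guard.offer(7)   # newer payload
p guard.install_count
```

One consequence worth noticing, and it follows from the diff rather than from this sketch: installing an unversioned payload sets the held generation to 0, so any later versioned payload is accepted.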
data/lib/quonfig/http_connection.rb
CHANGED
@@ -13,9 +13,17 @@ module Quonfig
       'X-Quonfig-SDK-Version' => SDK_VERSION
     }.freeze
 
-
+    # +timeout_ms+ (qfg-7h5d.1.9): per-request bound applied to BOTH the connect
+    # (open) and read phases of every request made through this connection. nil
+    # leaves Faraday's defaults (no timeout) in place — preserving the prior
+    # behavior for callers that don't pass one. The config-fetch path passes
+    # Options#config_fetch_timeout_ms so a hung upstream (accepts the TCP
+    # connection but never responds) aborts fast instead of blocking the caller's
+    # whole init budget.
+    def initialize(uri, sdk_key, timeout_ms: nil)
       @uri = uri
       @sdk_key = sdk_key
+      @timeout_ms = timeout_ms
     end
 
     attr_reader :uri
@@ -32,6 +40,15 @@ module Quonfig
       merged = JSON_HEADERS.merge('Authorization' => auth_header).merge(headers)
       Faraday.new(@uri) do |conn|
         conn.headers.merge!(merged)
+        if @timeout_ms
+          seconds = @timeout_ms / 1000.0
+          # open_timeout bounds the TCP connect; timeout bounds the read. A
+          # 'timeout' toxic accepts the connection but never sends bytes, so the
+          # read deadline is the one that fires — set both so a refused/slow
+          # connect is bounded too.
+          conn.options.open_timeout = seconds
+          conn.options.timeout = seconds
+        end
       end
     end
 
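The same "bound both the connect and the read phase" idea can be tried without Faraday. The sketch below swaps in Ruby's standard-library `Net::HTTP` purely so it runs with no extra gems; it is not how the gem builds its connection. `bounded_http` and the example URL are hypothetical, and no request is actually sent.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build a Net::HTTP client whose connect and read phases
# are both bounded by the same millisecond budget. nil leaves the library
# defaults untouched.
def bounded_http(url, timeout_ms: nil)
  uri = URI(url)
  http = Net::HTTP.new(uri.host, uri.port)

  if timeout_ms
    seconds = timeout_ms / 1000.0
    http.open_timeout = seconds # bounds the TCP connect
    http.read_timeout = seconds # bounds waiting for response bytes
  end

  http
end

http = bounded_http('http://config.example.test', timeout_ms: 3_000)
p [http.open_timeout, http.read_timeout]
```

Setting only the read deadline would still leave a slow or refused connect unbounded, which is the case the comment in the diff calls out.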
data/lib/quonfig/options.rb
CHANGED
@@ -7,7 +7,8 @@ module Quonfig
   class Options
     attr_reader :sdk_key, :environment, :api_urls, :sse_api_urls, :telemetry_destination, :config_api_urls,
                 :on_no_default, :init_timeout_ms, :on_init_failure, :collect_sync_interval, :datadir, :enable_sse, :fallback_poll_enabled, :fallback_poll_interval_ms, :global_context, :logger_key, :logger, :enable_quonfig_user_context,
-                :data_dir_auto_reload, :data_dir_auto_reload_debounce_ms
+                :data_dir_auto_reload, :data_dir_auto_reload_debounce_ms, :config_fetch_timeout_ms,
+                :config_fetch_hedge_delay_ms, :config_fetch_hedge_abort_ms
     attr_accessor :is_fork
 
     # Default fallback poll interval, in milliseconds. The SDK polls api-delivery
@@ -18,6 +19,38 @@ module Quonfig
     # long for the initial config fetch before failing per :on_init_failure.
     DEFAULT_INIT_TIMEOUT_MS = 10_000
 
+    # Default per-URL config-fetch timeout, in milliseconds (qfg-7h5d.1.9). Each
+    # leg in config_api_urls gets its own bounded attempt on the initial fetch
+    # AND the fallback poller, so a hung primary aborts fast (~3s) and leaves
+    # budget to reach the secondary inside init_timeout_ms instead of starving it
+    # until the global deadline. ~3s is short enough to fail over well inside a
+    # default 10s init budget, long enough to tolerate a slow-but-healthy
+    # upstream. Additive + a default that already fails over → backward
+    # compatible, not a breaking change.
+    DEFAULT_CONFIG_FETCH_TIMEOUT_MS = 3_000
+
+    # Default hedge delay, in milliseconds (qfg-7h5d.1.14). On the init/refresh
+    # config-fetch the SDK fires the PRIMARY leg first; if it has not settled
+    # within this delay (or errors fast) the SDK ALSO fires the secondary leg in
+    # PARALLEL without cancelling the primary. ~2s is below a realistic
+    # slow-but-alive primary's worst case yet far enough below the per-leg abort
+    # that a healthy sub-second primary is NEVER hedged — the secondary stays a
+    # cold standby and a healthy system adds zero secondary load. Standardized to
+    # 2000ms across all backend SDKs (qfg-7h5d.1.14). Tunable via
+    # +config_fetch_hedge_delay_ms+. Additive + backward compatible.
+    DEFAULT_CONFIG_FETCH_HEDGE_DELAY_MS = 2_000
+
+    # Default per-leg hedge hard-abort deadline, in milliseconds (qfg-7h5d.1.14).
+    # The hedged config-fetch path bounds each leg by this instead of
+    # #config_fetch_timeout_ms (which still governs the sequential FetchConfigs
+    # path). It MUST exceed the longest healable primary latency so a late-but-
+    # newer primary heals forward (rather than aborting), and MUST be <
+    # init_timeout_ms so the init-path heal leg is not clipped — the client logs a
+    # warning at construction if init_timeout_ms <= this value. ~6s sits between a
+    # ~3s slow-but-healthy upstream and the default 10s init budget. Tunable via
+    # +config_fetch_hedge_abort_ms+. Additive + backward compatible.
+    DEFAULT_CONFIG_FETCH_HEDGE_ABORT_MS = 6_000
+
     # Deprecated alias for #fallback_poll_enabled. Will be removed in a future
     # minor release.
     def enable_polling
@@ -184,6 +217,9 @@ module Quonfig
       init_timeout_ms: nil,
       initialization_timeout_sec: nil,
       on_init_failure: ON_INITIALIZATION_FAILURE::RAISE,
+      config_fetch_timeout_ms: nil,
+      config_fetch_hedge_delay_ms: nil,
+      config_fetch_hedge_abort_ms: nil,
       collect_max_paths: DEFAULT_MAX_PATHS,
       collect_sync_interval: nil,
       context_upload_mode: :periodic_example, # :periodic_example, :shapes_only, :none
@@ -238,6 +274,14 @@ module Quonfig
         DEFAULT_INIT_TIMEOUT_MS
       end
       @on_init_failure = on_init_failure
+      # qfg-7h5d.1.9: per-URL config-fetch timeout. nil → DEFAULT_CONFIG_FETCH_TIMEOUT_MS.
+      @config_fetch_timeout_ms = config_fetch_timeout_ms || DEFAULT_CONFIG_FETCH_TIMEOUT_MS
+      # qfg-7h5d.1.14: parallel-failover hedge knobs. nil → defaults. The hedge
+      # delay is when the secondary ALSO fires in parallel; the hedge abort is the
+      # per-leg hard deadline on the hedged path (the sequential FetchConfigs path
+      # keeps using config_fetch_timeout_ms).
+      @config_fetch_hedge_delay_ms = config_fetch_hedge_delay_ms || DEFAULT_CONFIG_FETCH_HEDGE_DELAY_MS
+      @config_fetch_hedge_abort_ms = config_fetch_hedge_abort_ms || DEFAULT_CONFIG_FETCH_HEDGE_ABORT_MS
 
       @collect_max_paths = collect_max_paths
       @collect_sync_interval = collect_sync_interval
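The options changes follow a "nil means use the default" convention for the three fetch-timing knobs. A compact way to see the resulting values is to resolve them in one place, as sketched below. `FETCH_KNOB_DEFAULTS` and `resolve_fetch_knobs` are hypothetical names for illustration; the default numbers are the ones shown in the diff, and the real `Quonfig::Options` class resolves each knob individually with `||`.

```ruby
# Defaults as shown in the options.rb diff, gathered into one hash for the sketch.
FETCH_KNOB_DEFAULTS = {
  init_timeout_ms: 10_000,
  config_fetch_timeout_ms: 3_000,
  config_fetch_hedge_delay_ms: 2_000,
  config_fetch_hedge_abort_ms: 6_000
}.freeze

# Hypothetical resolver: nil overrides fall back to the default value.
def resolve_fetch_knobs(**overrides)
  unknown = overrides.keys - FETCH_KNOB_DEFAULTS.keys
  raise ArgumentError, "unknown option(s): #{unknown.join(', ')}" unless unknown.empty?

  # Hash#compact drops nil values, so they never shadow a default.
  FETCH_KNOB_DEFAULTS.merge(overrides.compact)
end

p resolve_fetch_knobs(config_fetch_hedge_delay_ms: nil)
p resolve_fetch_knobs(config_fetch_hedge_delay_ms: 500, config_fetch_hedge_abort_ms: 4_000)
```

With the defaults, the ordering the comments ask for holds: hedge delay (2000) is below the hedge abort (6000), which is below the init timeout (10000).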
data/lib/quonfig/sse_config_client.rb
CHANGED
@@ -89,6 +89,19 @@ module Quonfig
 
       @source_index = -1
       @last_event_id = nil
+      # qfg-7h5d.1.9: latches true if the SSE reconnect rotation ever selects a
+      # non-primary (index > 0) sse_api_urls leg. The failover epic asserts SSE
+      # does NOT fail over to the secondary (f05) — it stays on its own endpoint
+      # and degrades via the single-upstream SSE↔HTTP fallback. With a single SSE
+      # URL configured this can never flip; the flag makes the design choice
+      # observable (and would surface a regression if SSE were given two legs).
+      @failed_over_to_secondary = false
+    end
+
+    # True if the live SSE stream has ever connected to a non-primary leg. Read
+    # by the failover chaos probe (f05). See @failed_over_to_secondary.
+    def failed_over_to_secondary?
+      @failed_over_to_secondary
     end
 
     # Layer 1 (SSE) reconnect counter. Bumped exactly once per reconnect
@@ -398,6 +411,7 @@ module Quonfig
     def current_url
       urls = @prefab_options.sse_api_urls
       @source_index = (@source_index + 1) % urls.size
+      @failed_over_to_secondary = true if @source_index.positive?
       urls[@source_index]
     end
 
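The SSE change is a one-way latch on top of the existing round-robin URL rotation. The sketch below reproduces that behavior in a tiny class to show why the flag can never flip with a single URL and never resets once set. `UrlRotation` and `next_url` are hypothetical names, and the example URLs are placeholders.

```ruby
# Hypothetical model of the rotation-plus-latch logic in current_url.
class UrlRotation
  def initialize(urls)
    @urls = urls
    @index = -1          # first call advances to index 0
    @failed_over = false # latches true, never reset
  end

  def failed_over_to_secondary?
    @failed_over
  end

  def next_url
    @index = (@index + 1) % @urls.size
    @failed_over = true if @index.positive?
    @urls[@index]
  end
end

single = UrlRotation.new(['https://stream.example.test'])
3.times { single.next_url }
p single.failed_over_to_secondary?

pair = UrlRotation.new(['https://stream-a.example.test', 'https://stream-b.example.test'])
pair.next_url
pair.next_url
p pair.failed_over_to_secondary?
```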
data/lib/quonfig/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: quonfig
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.1.0
 platform: ruby
 authors:
 - Jeff Dwyer
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2026-
+date: 2026-07-01 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: activesupport