lex-microsoft_teams 0.6.50 → 0.6.51

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 35010357ecbf55560b6957ae2467c5975eec04056da102a34b1d4c870a0f1ce3
-   data.tar.gz: 38d375c241ba2691948bf6d80f275410f0a1003e0e7456852f3bc2ca1bf4acce
+   metadata.gz: 76994e2ead987d49e26c776ff30d9e6a2b78789a12654d38e80e36e81814d703
+   data.tar.gz: ebf842a7daf81e017078e894d111d38f9474ecc1b3ced0f4105cea29dfb1b7fb
  SHA512:
-   metadata.gz: ae71a656141975e85dc7b5e294e99d7d5a1f67a34e57feea943f177d4bf1c31224ef462adc5140163f6cb028ca1fede816fa713c25185522d8718fe01631a0af
-   data.tar.gz: b2e865c65531f7d464d7f058c2e8d965b1ca96643ca15bee7a6018c0dd260d7477771b0418404f9a6dd0ef8fc8dd983d8c68d0a06704fb2432af031d377150d6
+   metadata.gz: ac9c034b99ef410ab0c1727f83409cd8d8d063b20082ace208be85e0a1bbd4c8314b983caf4c2a80af82d9ca6245e3616d67dae3f60cdff4a204bea087b2d777
+   data.tar.gz: 1b13b4c36751a4dd02888c3867acafd0594d6a1a83e00668481a07533997de3bfce5abd68b66fc56adbb5bce6006601d1994daa16ebdb3738bd9309f0ff244a1
data/CHANGELOG.md CHANGED
@@ -1,5 +1,25 @@
  # Changelog
 
+ ## [0.6.51] - 2026-05-27
+ ### Added
+ - `Faraday::RetryAfter` middleware honoring `Retry-After` per RFC 9110 §10.2.3 (originally RFC 7231 §7.1.3) in both delta-seconds and HTTP-date forms. Retries HTTP 429 by default; 503/504 opt-in. Bounded by `max_retries` and cumulative `max_wait`; ±jitter on advertised wait to avoid thundering-herd. Configurable via `microsoft_teams.client.retry.{max_retries,max_wait,jitter,fallback_wait,retry_statuses}`; `fetch` semantics preserve explicit falsey values.
+ - `Errors::Throttled` exception with `status`, `retry_after` (nullable; `nil` distinguishes "no server guidance" from "retry immediately"), `retry_after_known?` predicate, `request`, and `attempts`.
+ - `Faraday::RetryAfter.parse_header` shared class method (single source of truth for Retry-After parsing).
+ - `default_settings` for all actors with explicit `enabled`, `interval`, and tuning knobs. Actors no longer need nil guards — values are guaranteed by the extension settings merge on boot. Defaults: `api_ingest` 3600s, `incremental_sync` 900s, `presence_poller` disabled, `channel_poller` disabled, `direct_chat_poller` disabled, `observed_chat_poller` disabled, `meeting_ingest` 900s, `profile_ingest` enabled (once on boot).
+ - `Helpers::GraphCache` module — `cached_graph_get` wraps Graph API calls with `Legion::Cache` TTL-based caching. Supports `shared: true` for resource-scoped endpoints (e.g., `/chats/{id}/members` — same data for all participants) vs user-scoped for `/me/*` endpoints. Cache keys incorporate the process identity UUID to prevent cross-user leakage.
+
+ ### Fixed
+ - Stuck or chatty consumers no longer brown out other users on the same Entra app registration's Graph quota. The `RetryAfter` middleware now raises `Errors::Throttled` **centrally on exhaustion** — every consumer of `graph_connection` and `bot_connection` gets the typed event without per-callsite handling. Fixes #18.
+ - `Helpers::GraphClient#handle_graph_response` retains a defensive 429/503/504 branch that raises `Errors::Throttled` for callers that build a Faraday connection without the middleware (custom tests, ad-hoc tooling).
+ - Logger acquisition failures no longer silently drop retry telemetry — falls back to `Legion::Logging` unconditionally; loss of those signals was how the original outage went undiagnosed for days.
+ - **O(N×M) member scan eliminated** — `ProfileIngest#find_chat_for_person` and `ApiIngest#match_chat_to_person` previously called `GET /chats/{id}/members` for every chat × every person (up to 500+ calls per tick). Replaced with `build_chat_member_index` that fetches members once per chat and builds an in-memory lookup hash. Reduces ~514 calls/tick to ~50 for `IncrementalSync`; ~7,500 calls/tick to ~65 for `ApiIngest`.
+ - `IncrementalSync` interval raised from 120s to 900s (was the single largest source of sustained Graph pressure).
+ - `ApiIngest` interval raised from 1800s to 3600s.
+ - Actors that were accidentally re-enabled by `952607c` (rubocop removed `return false` guards) are now properly gated by `settings[:actor_name][:enabled]` — `presence_poller`, `channel_poller`, `direct_chat_poller`, and `observed_chat_poller` all default to `false`.
+
+ ### Known follow-up
+ - Actors (`*_poller.rb`, `meeting_ingest`, `profile_ingest`) still catch `Errors::Throttled` via the generic `rescue StandardError` block but do not yet *defer their next scheduled run* using the carried `retry_after`. To be addressed in a follow-up issue.
+
  ## [0.6.50] - 2026-05-27
  ### Added
  - Full OData query parameter support across all Graph API runner methods per Microsoft Graph REST v1.0 docs
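The member-scan fix in the 0.6.51 changelog entry replaces a per-person lookup with an index built once per tick. A minimal Ruby sketch of that idea follows. `FakeGraph`, its `chat_members` method, and the signature given to `build_chat_member_index` here are assumptions made for illustration; the gem's real helper is not shown in this diff.

```ruby
# Hypothetical stand-in for GET /chats/{id}/members that counts its calls.
class FakeGraph
  attr_reader :calls

  def initialize(members_by_chat)
    @members_by_chat = members_by_chat
    @calls = 0
  end

  def chat_members(chat_id)
    @calls += 1
    @members_by_chat.fetch(chat_id, [])
  end
end

# Fetch members once per chat and invert the result into
# user_id => [chat_id, ...], so later lookups need no further API calls.
def build_chat_member_index(graph, chat_ids)
  index = {}
  chat_ids.each do |chat_id|
    graph.chat_members(chat_id).each do |user_id|
      (index[user_id] ||= []) << chat_id
    end
  end
  index
end

graph = FakeGraph.new('c1' => %w[alice bob], 'c2' => %w[bob])
index = build_chat_member_index(graph, %w[c1 c2])
puts index.fetch('bob', []).inspect # chats shared with bob
puts graph.calls                    # one call per chat, regardless of people
```

With M chats and N people, this costs M member fetches instead of up to M × N, which is the shape of the reduction the changelog reports.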
@@ -27,12 +27,12 @@ module Legion
            end
 
            def time
-             interval = teams_settings.dig(:ingest, :api_interval) || 1800
-             interval.to_i
+             teams_settings.dig(:api_ingest, :interval)
            end
 
            def enabled?
-             defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces)
+             teams_settings.dig(:api_ingest, :enabled) &&
+               defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces)
            rescue StandardError => e
              handle_exception(e, level: :warn, operation: 'ApiIngest#enabled?')
              false
@@ -46,13 +46,13 @@ module Legion
                return
              end
 
-             ingest = teams_settings[:ingest] || {}
+             ai_settings = teams_settings[:api_ingest]
              log.info('ApiIngest: starting Graph API ingest')
              result = runner_class.ingest_api(
                token: token,
-               top_people: ingest.fetch(:top_people, 15),
-               message_depth: ingest.fetch(:message_depth, 50),
-               skip_bots: ingest.fetch(:skip_bots, true),
+               top_people: ai_settings[:top_people],
+               message_depth: ai_settings[:message_depth],
+               skip_bots: ai_settings[:skip_bots],
                imprint_active: imprint_active?
              )
              log.info("ApiIngest: #{result.inspect[0, 200]}")
@@ -75,12 +75,7 @@ module Legion
            end
 
            def teams_settings
-             return {} unless defined?(Legion::Settings)
-
-             Legion::Settings[:microsoft_teams] || {}
-           rescue StandardError => e
-             handle_exception(e, level: :warn, operation: 'ApiIngest#teams_settings')
-             {}
+             settings
            end
 
            def imprint_active?
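The hunks above drop the inline `fetch` defaults and read `teams_settings[:api_ingest]` directly, relying on the settings merge that the changelog says runs on boot. The sketch below shows one way such a merge could keep explicit falsey overrides intact. `example_defaults` and `deep_merge` are hypothetical names, and the default values are illustrative rather than taken from the gem's `default_settings`.

```ruby
# Hypothetical defaults; the real values live in the extension's settings.
def example_defaults
  { api_ingest: { enabled: true, interval: 3600, skip_bots: true } }
end

# Recursively merge user overrides into defaults. An override wins whenever
# its key is present, even when the value is false, which a
# `value || default` expression would silently discard.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, default_value, override_value|
    if default_value.is_a?(Hash) && override_value.is_a?(Hash)
      deep_merge(default_value, override_value)
    else
      override_value
    end
  end
end

merged = deep_merge(example_defaults, { api_ingest: { skip_bots: false } })
puts merged.dig(:api_ingest, :skip_bots).inspect # false (explicit override kept)
puts merged.dig(:api_ingest, :interval)          # 3600 (default filled in)
```

Once every key is guaranteed by a merge of this kind, the actor code can index the hash without nil guards, which is what the new `ai_settings[:top_people]` style reads assume.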
@@ -7,10 +7,6 @@ module Legion
          class ChannelPoller < Legion::Extensions::Actors::Every
            include Legion::Extensions::MicrosoftTeams::Helpers::Client
 
-           DEFAULT_INTERVAL = 60
-           DEFAULT_MAX_TEAMS = 10
-           DEFAULT_MAX_CHANNELS = 5
-
            def initialize(**opts)
              return unless enabled?
 
@@ -20,7 +16,7 @@ module Legion
 
            def runner_class = self.class
            def runner_function = 'manual'
-           def time = channel_setting(:poll_interval, DEFAULT_INTERVAL)
+           def time = settings.dig(:channel_poller, :interval)
            def delay = 300
            def run_now? = false
            def use_runner? = false
@@ -36,7 +32,7 @@ module Legion
            end
 
            def enabled?
-             # channel_setting(:enabled, false) == true
+             settings.dig(:channel_poller, :enabled)
            rescue StandardError => e
              handle_exception(e, level: :debug, operation: 'ChannelPoller#enabled?')
              false
@@ -148,11 +144,11 @@ module Legion
            end
 
            def max_teams
-             channel_setting(:max_teams, DEFAULT_MAX_TEAMS)
+             settings.dig(:channel_poller, :max_teams)
            end
 
            def max_channels_per_team
-             channel_setting(:max_channels_per_team, DEFAULT_MAX_CHANNELS)
+             settings.dig(:channel_poller, :max_channels_per_team)
            end
 
            def delegated_token
@@ -162,17 +158,8 @@ module Legion
              nil
            end
 
-           def channel_setting(key, default)
-             return default unless defined?(Legion::Settings)
-
-             Legion::Settings.dig(:microsoft_teams, :channels, key) || default
-           rescue StandardError => e
-             handle_exception(e, level: :debug, operation: "ChannelPoller#channel_setting(#{key})")
-             default
-           end
-
            def channel_traces_enabled?
-             channel_setting(:store_traces, false) == true
+             settings.dig(:channel_poller, :store_traces) == true
            end
 
            def store_channel_message_trace(team_name:, channel_name:, msg:)
@@ -8,8 +8,6 @@ module Legion
            include Legion::Extensions::MicrosoftTeams::Helpers::Client
            include Legion::Extensions::MicrosoftTeams::Helpers::HighWaterMark
 
-           POLL_INTERVAL = 15
-
            def initialize(**opts)
              return unless enabled?
 
@@ -19,7 +17,7 @@ module Legion
 
            def runner_class = Legion::Extensions::MicrosoftTeams::Runners::Bot
            def runner_function = 'handle_message'
-           def time = settings_interval(:direct_poll_interval, POLL_INTERVAL)
+           def time = settings.dig(:direct_chat_poller, :interval)
            def delay = 60
            def run_now? = false
            def use_runner? = false
@@ -27,7 +25,8 @@ module Legion
            def generate_task? = false
 
            def enabled?
-             defined?(Legion::Extensions::MicrosoftTeams::Runners::Bot) &&
+             settings.dig(:direct_chat_poller, :enabled) &&
+               defined?(Legion::Extensions::MicrosoftTeams::Runners::Bot) &&
                Legion.const_defined?(:Transport, false)
            rescue StandardError => e
              handle_exception(e, level: :debug, operation: 'DirectChatPoller#enabled?')
@@ -95,9 +94,7 @@ module Legion
            end
 
            def bot_id_from_settings
-             return nil unless defined?(Legion::Settings)
-
-             Legion::Settings.dig(:microsoft_teams, :bot, :bot_id)
+             settings.dig(:bot, :bot_id)
            end
 
            def delegated_token
@@ -106,12 +103,6 @@ module Legion
              handle_exception(e, level: :warn, operation: 'DirectChatPoller#delegated_token')
              nil
            end
-
-           def settings_interval(key, default)
-             return default unless defined?(Legion::Settings)
-
-             Legion::Settings.dig(:microsoft_teams, :bot, key) || default
-           end
          end
        end
      end
@@ -14,17 +14,12 @@ module Legion
            def delay = 60
 
            def time
-             settings = begin
-               Legion::Settings[:microsoft_teams] || {}
-             rescue StandardError => e
-               handle_exception(e, level: :debug, operation: 'IncrementalSync#time')
-               {}
-             end
-             settings.dig(:ingest, :incremental_interval) || 120
+             settings.dig(:incremental_sync, :interval)
            end
 
            def enabled?
-             defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces) &&
+             settings.dig(:incremental_sync, :enabled) &&
+               defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces) &&
                token_available?
            rescue StandardError => e
              handle_exception(e, level: :debug, operation: 'IncrementalSync#enabled?')
@@ -36,17 +31,11 @@ module Legion
              token = resolve_token
              return unless token
 
-             settings = begin
-               Legion::Settings[:microsoft_teams] || {}
-             rescue StandardError => e
-               handle_exception(e, level: :debug, operation: 'IncrementalSync#manual settings')
-               {}
-             end
-             ingest = settings[:ingest] || {}
+             is_settings = settings[:incremental_sync]
              runner_class.incremental_sync(
                token: token,
-               top_people: ingest.fetch(:top_people, 10),
-               message_depth: ingest.fetch(:message_depth, 50)
+               top_people: is_settings[:top_people],
+               message_depth: is_settings[:message_depth]
              )
            rescue StandardError => e
              handle_exception(e, level: :error, operation: 'IncrementalSync#manual')
@@ -7,8 +7,6 @@ module Legion
          class MeetingIngest < Legion::Extensions::Actors::Every
            include Legion::Extensions::MicrosoftTeams::Helpers::Client
 
-           DEFAULT_INGEST_INTERVAL = 300
-
            def runner_class = self.class
            def runner_function = 'manual'
            def run_now? = false
@@ -22,17 +20,12 @@ module Legion
            end
 
            def time
-             settings = begin
-               Legion::Settings[:microsoft_teams] || {}
-             rescue StandardError => e
-               handle_exception(e, level: :debug, operation: 'MeetingIngest#time')
-               {}
-             end
-             settings.dig(:meetings, :ingest_interval) || DEFAULT_INGEST_INTERVAL
+             settings.dig(:meeting_ingest, :interval)
            end
 
            def enabled?
-             Legion::Extensions::Identity::Entra::Helpers::TokenManager.respond_to?(:load_token)
+             settings.dig(:meeting_ingest, :enabled) &&
+               Legion::Extensions::Identity::Entra::Helpers::TokenManager.respond_to?(:load_token)
            rescue StandardError => e
              handle_exception(e, level: :debug, operation: 'MeetingIngest#enabled?')
              false
@@ -8,8 +8,6 @@ module Legion
            include Legion::Extensions::MicrosoftTeams::Helpers::Client
            include Legion::Extensions::MicrosoftTeams::Helpers::HighWaterMark
 
-           POLL_INTERVAL = 30
-
            def initialize(**opts)
              return unless enabled?
 
@@ -18,7 +16,7 @@ module Legion
 
            def runner_class = Legion::Extensions::MicrosoftTeams::Runners::Bot
            def runner_function = 'observe_message'
-           def time = settings_interval(:observe_poll_interval, POLL_INTERVAL)
+           def time = settings.dig(:observed_chat_poller, :interval)
            def delay = 180
            def run_now? = false
            def use_runner? = false
@@ -26,11 +24,9 @@ module Legion
            def generate_task? = false
 
            def enabled?
-             return false unless defined?(Legion::Extensions::MicrosoftTeams::Runners::Bot)
-             return false unless Legion.const_defined?(:Transport, false)
-             return false unless defined?(Legion::Settings)
-
-             Legion::Settings.dig(:microsoft_teams, :bot, :observe, :enabled) == true
+             settings.dig(:observed_chat_poller, :enabled) &&
+               defined?(Legion::Extensions::MicrosoftTeams::Runners::Bot) &&
+               Legion.const_defined?(:Transport, false)
            rescue StandardError => e
              handle_exception(e, level: :debug, operation: 'ObservedChatPoller#enabled?')
              false
@@ -106,12 +102,6 @@ module Legion
              handle_exception(e, level: :warn, operation: 'ObservedChatPoller#delegated_token')
              nil
            end
-
-           def settings_interval(key, default)
-             return default unless defined?(Legion::Settings)
-
-             Legion::Settings.dig(:microsoft_teams, :bot, key) || default
-           end
          end
        end
      end
@@ -7,8 +7,6 @@ module Legion
          class PresencePoller < Legion::Extensions::Actors::Every
            include Legion::Extensions::MicrosoftTeams::Helpers::Client
 
-           DEFAULT_POLL_INTERVAL = 60
-
            def runner_class = self.class
            def runner_function = 'manual'
            def run_now? = false
@@ -17,13 +15,12 @@ module Legion
            def generate_task? = false
 
            def time
-             return DEFAULT_POLL_INTERVAL unless defined?(Legion::Settings)
-
-             Legion::Settings.dig(:microsoft_teams, :presence, :poll_interval) || DEFAULT_POLL_INTERVAL
+             settings.dig(:presence_poller, :interval)
            end
 
            def enabled?
-             Legion::Extensions::Identity::Entra::Helpers::TokenManager.respond_to?(:load_token)
+             settings.dig(:presence_poller, :enabled) &&
+               Legion::Extensions::Identity::Entra::Helpers::TokenManager.respond_to?(:load_token)
            rescue StandardError => e
              handle_exception(e, level: :debug, operation: 'PresencePoller#enabled?')
              false
@@ -21,7 +21,8 @@ module Legion
            end
 
            def enabled?
-             defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces) &&
+             settings.dig(:profile_ingest, :enabled) &&
+               defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces) &&
                token_available?
            rescue StandardError => e
              handle_exception(e, level: :debug, operation: 'ProfileIngest#enabled?')
@@ -37,17 +38,11 @@ module Legion
              end
              log.info('ProfileIngest: token acquired, starting ingest')
 
-             settings = begin
-               Legion::Settings[:microsoft_teams] || {}
-             rescue StandardError => e
-               handle_exception(e, level: :debug, operation: 'ProfileIngest#manual settings')
-               {}
-             end
-             ingest = settings[:ingest] || {}
+             pi_settings = settings[:profile_ingest]
              runner_class.full_ingest(
                token: token,
-               top_people: ingest.fetch(:top_people, 10),
-               message_depth: ingest.fetch(:message_depth, 50)
+               top_people: pi_settings[:top_people],
+               message_depth: pi_settings[:message_depth]
              )
            rescue StandardError => e
              handle_exception(e, level: :error, operation: 'ProfileIngest#manual')
@@ -0,0 +1,84 @@
+ # frozen_string_literal: true
+
+ module Legion
+   module Extensions
+     module MicrosoftTeams
+       module Errors
+         # Raised when Microsoft Graph (or the Bot Framework) throttles the
+         # caller and the retry policy has been exhausted, or when an actor
+         # wants to surface a throttle event without retrying further.
+         #
+         # `retry_after` carries the last advertised Retry-After interval
+         # (in seconds) as parsed from the upstream header. **It is nil when
+         # the server returned no Retry-After header or one we could not
+         # parse** — callers must check `retry_after_known?` before treating
+         # the value as a server directive. Conflating "header missing" with
+         # "retry immediately" was the bug the original fleet outage exposed.
+         class Throttled < StandardError
+           attr_reader :status, :retry_after, :request, :attempts
+
+           # @param status [Integer] the upstream HTTP status (e.g. 429)
+           # @param retry_after [Float, Integer, nil] seconds the server
+           #   advised waiting, or nil if the header was absent/unparseable
+           # @param request [String, nil] the path or URL that was throttled
+           # @param attempts [Integer, nil] how many retries the middleware
+           #   tried before giving up; nil means "not tracked"
+           def initialize(status:, retry_after: nil, request: nil, attempts: nil)
+             @status = coerce_status(status)
+             @retry_after = coerce_retry_after(retry_after)
+             @request = request
+             @attempts = attempts.nil? ? nil : Integer(attempts)
+             super(build_message)
+           end
+
+           # @return [Boolean] true when the upstream advised a specific wait
+           #   interval; false when the header was missing or unparseable. Use
+           #   this to decide whether to honor the wait verbatim or apply a
+           #   local policy default before re-scheduling.
+           def retry_after_known?
+             !@retry_after.nil?
+           end
+
+           private
+
+           def coerce_status(value)
+             Integer(value)
+           # rubocop:disable Legion/RescueLogging/NoCapture
+           # No logger available during exception construction; we re-raise
+           # with a clearer message instead.
+           rescue ArgumentError, TypeError
+             raise ArgumentError, "Throttled status must be an Integer, got #{value.inspect}"
+           # rubocop:enable Legion/RescueLogging/NoCapture
+           end
+
+           def coerce_retry_after(value)
+             return nil if value.nil?
+
+             seconds = Float(value)
+             seconds.negative? ? 0.0 : seconds
+           # rubocop:disable Legion/RescueLogging/NoCapture
+           # Unparseable retry_after intentionally collapses to nil so the
+           # public `retry_after_known?` predicate is the single source of
+           # truth for "did the server give us usable guidance." Logging
+           # belongs at the parse site (Faraday::RetryAfter), not here.
+           rescue ArgumentError, TypeError
+             nil
+           # rubocop:enable Legion/RescueLogging/NoCapture
+           end
+
+           def build_message
+             parts = ["Microsoft Graph throttled (HTTP #{@status})"]
+             parts << "after #{@attempts} attempt(s)" if @attempts
+             parts << if retry_after_known?
+                        "retry_after=#{format('%.2f', @retry_after)}s"
+                      else
+                        'retry_after=unknown'
+                      end
+             parts << "request=#{@request}" if @request
+             parts.join('; ')
+           end
+         end
+       end
+     end
+   end
+ end
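The class comment above stresses that a nil `retry_after` means "no server guidance" and must not be read as "retry immediately". The sketch below shows how a caller might act on that distinction when rescheduling. It uses a simplified stand-in for `Errors::Throttled` so it runs on its own; `next_run_delay`, `run_tick`, and the 30-second default are hypothetical and are not part of the gem (the changelog lists actor-side deferral as a known follow-up).

```ruby
# Simplified stand-in for Errors::Throttled; the real class also coerces
# and validates its arguments and carries `request` and `attempts`.
class Throttled < StandardError
  attr_reader :status, :retry_after

  def initialize(status:, retry_after: nil)
    @status = status
    @retry_after = retry_after
    super("throttled (HTTP #{status})")
  end

  def retry_after_known?
    !@retry_after.nil?
  end
end

# Hypothetical policy: honor the server's wait when one was given, otherwise
# use a local default. A known value of 0.0 is a valid directive and is
# returned as-is; `error.retry_after || default_delay` would also keep it,
# but the predicate makes the intent explicit.
def next_run_delay(error, default_delay: 30.0)
  error.retry_after_known? ? error.retry_after : default_delay
end

# Runs one unit of work; returns nil on success or the delay (in seconds)
# to apply before the next run when the work was throttled.
def run_tick
  yield
  nil
rescue Throttled => e
  next_run_delay(e)
end

puts run_tick { raise Throttled.new(status: 429, retry_after: 12.5) }.inspect
puts run_tick { raise Throttled.new(status: 429) }.inspect
```

The first call prints the server-advised 12.5 seconds and the second falls back to the assumed 30.0-second default.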
@@ -0,0 +1,209 @@
+ # frozen_string_literal: true
+
+ require 'faraday'
+ require 'time'
+ require 'legion/extensions/microsoft_teams/errors'
+
+ module Legion
+   module Extensions
+     module MicrosoftTeams
+       module Faraday
+         # Faraday middleware that retries throttled responses honoring the
+         # upstream Retry-After header per RFC 9110 §10.2.3 (originally
+         # specified in RFC 7231 §7.1.3). Retries HTTP 429 by default; 503
+         # and 504 are opt-in via `retry_statuses:`.
+         #
+         # Retry-After is parsed in two forms:
+         #
+         #   * delta-seconds (e.g. "120") — used as-is
+         #   * HTTP-date (e.g. "Wed, 27 May 2026 12:00:00 GMT")
+         #       — converted to delta
+         #         from current UTC,
+         #         clamped to >= 0
+         #
+         # The advertised wait is jittered by ±(jitter * wait) to avoid
+         # thundering-herd behavior across instances sharing one Entra app
+         # registration's Graph quota.
+         #
+         # When `max_retries` is reached, or cumulative wait would exceed
+         # `max_wait`, the middleware raises `Errors::Throttled` carrying
+         # the last advertised Retry-After (nil if the header was missing or
+         # unparseable), the final HTTP status, attempt count, and request
+         # path. Raising centrally — rather than returning a raw 429 and
+         # trusting every caller to detect it — is the difference between
+         # one typed event the fleet can defer on, and 60+ runner callsites
+         # that silently treat throttle envelopes as data.
+         class RetryAfter < ::Faraday::Middleware
+           DEFAULT_MAX_RETRIES = 3
+           DEFAULT_MAX_WAIT = 60.0
+           DEFAULT_JITTER = 0.2
+           DEFAULT_FALLBACK_WAIT = 1.0
+           DEFAULT_RETRY_STATUSES = [429].freeze
+
+           # Parse an HTTP Retry-After header value.
+           #
+           # @param raw [String, nil] the raw header value
+           # @param clock [#call] a callable returning current UTC Time,
+           #   injectable for deterministic tests
+           # @return [Float, nil] seconds to wait, or nil if `raw` is absent,
+           #   empty, or neither a numeric delta nor a valid HTTP-date
+           def self.parse_header(raw, clock: -> { Time.now.utc })
+             return nil if raw.nil?
+
+             value = raw.to_s.strip
+             return nil if value.empty?
+             return value.to_f if value.match?(/\A\d+(\.\d+)?\z/)
+
+             begin
+               target = Time.httpdate(value).utc
+               [(target - clock.call), 0.0].max
+             # rubocop:disable Legion/RescueLogging/NoCapture
+             # Pure parser — no logger access. The instance method
+             # `parse_advertised` warns on the same condition with full
+             # context; logging twice would just be noise.
+             rescue ArgumentError
+               nil
+             # rubocop:enable Legion/RescueLogging/NoCapture
+             end
+           end
+
+           def initialize(app, # rubocop:disable Metrics/ParameterLists
+                          max_retries: DEFAULT_MAX_RETRIES,
+                          max_wait: DEFAULT_MAX_WAIT,
+                          jitter: DEFAULT_JITTER,
+                          fallback_wait: DEFAULT_FALLBACK_WAIT,
+                          retry_statuses: DEFAULT_RETRY_STATUSES,
+                          sleeper: ->(seconds) { sleep(seconds) },
+                          logger: nil,
+                          clock: -> { Time.now.utc })
+             super(app)
+             @max_retries = Integer(max_retries)
+             @max_wait = Float(max_wait)
+             @jitter = Float(jitter)
+             @fallback_wait = Float(fallback_wait)
+             @retry_statuses = Array(retry_statuses).map(&:to_i).freeze
+             @sleeper = sleeper
+             @logger = logger
+             @clock = clock
+           end
+
+           def call(env)
+             attempts = 0
+             total_wait = 0.0
+             last_advertised = nil
+
+             loop do
+               response = @app.call(env.dup)
+               return response unless retryable?(response.status)
+
+               last_advertised = parse_advertised(response)
+               wait = compute_wait(last_advertised)
+
+               if attempts >= @max_retries || (total_wait + wait) > @max_wait
+                 log_giveup(env, response, attempts, total_wait)
+                 raise Errors::Throttled.new(
+                   status: response.status,
+                   retry_after: last_advertised,
+                   request: request_path(env),
+                   attempts: attempts
+                 )
+               end
+
+               attempts += 1
+               total_wait += wait
+               log_retry(env, response, wait, attempts)
+               @sleeper.call(wait)
+             end
+           end
+
+           private
+
+           def retryable?(status)
+             @retry_statuses.include?(status.to_i)
+           end
+
+           # Parse the advertised Retry-After value from a response. Returns
+           # nil if the header is missing or unparseable — callers branch on
+           # nil to distinguish "no server guidance" from "retry now".
+           def parse_advertised(response)
+             raw = retry_after_header(response)
+             parsed = self.class.parse_header(raw, clock: @clock)
+             @logger&.warn("[microsoft_teams][retry_after] unparseable Retry-After=#{raw.inspect}") if raw && !raw.to_s.strip.empty? && parsed.nil?
+             parsed
+           end
+
+           # Computes the actual wait the middleware will sleep before the
+           # next attempt. Falls back to `@fallback_wait` only when the
+           # server gave no usable guidance; jitter is always applied so
+           # concurrent instances don't synchronize their retries.
+           def compute_wait(advertised)
+             seconds = advertised || @fallback_wait
+             apply_jitter(seconds)
+           end
+
+           def retry_after_header(response)
+             headers = response_headers(response)
+             return nil unless headers
+
+             headers['Retry-After'] ||
+               headers['retry-after'] ||
+               headers['RETRY-AFTER']
+           end
+
+           # Faraday's Response exposes headers via #headers; some test
+           # doubles may not, so look in both common places.
+           def response_headers(response)
+             if response.respond_to?(:headers) && response.headers
+               response.headers
+             elsif response.respond_to?(:response_headers)
+               response.response_headers
+             end
+           end
+
+           def apply_jitter(seconds)
+             return seconds if @jitter.zero?
+
+             spread = seconds * @jitter
+             offset = ((rand * 2.0) - 1.0) * spread
+             wait = seconds + offset
+             wait.negative? ? 0.0 : wait
+           end
+
+           def request_path(env)
+             env.url.respond_to?(:path) ? env.url.path : env.url.to_s
+           # rubocop:disable Legion/RescueLogging/NoCapture
+           # Defensive fallback for malformed env.url; the path is only used
+           # for log lines and error messages, never for control flow.
+           rescue StandardError
+             nil
+           # rubocop:enable Legion/RescueLogging/NoCapture
+           end
+
+           def log_retry(env, response, wait, attempts)
+             return unless @logger
+
+             @logger.warn(
+               "[microsoft_teams][retry_after] status=#{response.status} " \
+               "method=#{env.method.to_s.upcase} path=#{request_path(env)} " \
+               "wait=#{format('%.2f', wait)}s attempt=#{attempts}"
+             )
+           rescue StandardError => e
+             warn("[microsoft_teams][retry_after] log_retry suppressed #{e.class}: #{e.message}")
+           end
+
+           def log_giveup(env, response, attempts, total_wait)
+             return unless @logger
+
+             @logger.error(
+               "[microsoft_teams][retry_after] giving up; status=#{response.status} " \
+               "method=#{env.method.to_s.upcase} path=#{request_path(env)} " \
+               "attempts=#{attempts} total_wait=#{format('%.2f', total_wait)}s"
+             )
+           rescue StandardError => e
+             warn("[microsoft_teams][retry_after] log_giveup suppressed #{e.class}: #{e.message}")
+           end
+         end
+       end
+     end
+   end
+ end
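To make the parsing rules and the give-up condition in `call` easier to follow, here is a standalone Ruby sketch that restates both using only the standard library. `parse_retry_after` mirrors the logic of `parse_header` above but takes a fixed `now:` instead of a clock callable. `simulate_budget` is a hypothetical helper that assumes every response is throttled and that jitter is disabled. Neither is part of the gem's API.

```ruby
require 'time'

# Restatement of the Retry-After parsing rules: numeric delta-seconds are
# used as-is, an HTTP-date becomes a non-negative delta from `now`, and
# anything else yields nil.
def parse_retry_after(raw, now:)
  value = raw.to_s.strip
  return nil if value.empty?
  return value.to_f if value.match?(/\A\d+(\.\d+)?\z/)

  [Time.httpdate(value).utc - now, 0.0].max
rescue ArgumentError
  nil
end

# Hypothetical helper: applies the same give-up rule as `call` to a list of
# advertised waits (nil means "no usable header") and returns
# [retries_performed, total_seconds_slept].
def simulate_budget(advertised_waits, max_retries: 3, max_wait: 60.0, fallback_wait: 1.0)
  attempts = 0
  total_wait = 0.0
  advertised_waits.each do |advertised|
    wait = advertised || fallback_wait
    break if attempts >= max_retries || (total_wait + wait) > max_wait

    attempts += 1
    total_wait += wait
  end
  [attempts, total_wait]
end

now = Time.utc(2026, 5, 27, 11, 58, 0)
puts parse_retry_after('120', now: now)                           # 120.0
puts parse_retry_after('Wed, 27 May 2026 12:00:00 GMT', now: now) # 120.0
puts parse_retry_after('soon', now: now).inspect                  # nil
puts simulate_budget([25.0, 25.0, 25.0, 25.0]).inspect            # [2, 50.0]
```

In the last line a third 25-second wait would push the cumulative total to 75 seconds, past the 60-second `max_wait`, so the budget is exhausted after two retries even though `max_retries` is 3.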