lex-microsoft_teams 0.6.50 → 0.6.51
- checksums.yaml +4 -4
- data/CHANGELOG.md +20 -0
- data/lib/legion/extensions/microsoft_teams/actors/api_ingest.rb +8 -13
- data/lib/legion/extensions/microsoft_teams/actors/channel_poller.rb +5 -18
- data/lib/legion/extensions/microsoft_teams/actors/direct_chat_poller.rb +4 -13
- data/lib/legion/extensions/microsoft_teams/actors/incremental_sync.rb +6 -17
- data/lib/legion/extensions/microsoft_teams/actors/meeting_ingest.rb +3 -10
- data/lib/legion/extensions/microsoft_teams/actors/observed_chat_poller.rb +4 -14
- data/lib/legion/extensions/microsoft_teams/actors/presence_poller.rb +3 -6
- data/lib/legion/extensions/microsoft_teams/actors/profile_ingest.rb +5 -10
- data/lib/legion/extensions/microsoft_teams/errors.rb +84 -0
- data/lib/legion/extensions/microsoft_teams/faraday/retry_after.rb +209 -0
- data/lib/legion/extensions/microsoft_teams/faraday/throttle_circuit.rb +150 -0
- data/lib/legion/extensions/microsoft_teams/helpers/client.rb +80 -3
- data/lib/legion/extensions/microsoft_teams/helpers/graph_cache.rb +65 -0
- data/lib/legion/extensions/microsoft_teams/helpers/graph_client.rb +20 -0
- data/lib/legion/extensions/microsoft_teams/runners/api_ingest.rb +34 -22
- data/lib/legion/extensions/microsoft_teams/runners/profile_ingest.rb +23 -11
- data/lib/legion/extensions/microsoft_teams/version.rb +1 -1
- data/lib/legion/extensions/microsoft_teams.rb +59 -0
- metadata +5 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 76994e2ead987d49e26c776ff30d9e6a2b78789a12654d38e80e36e81814d703
+  data.tar.gz: ebf842a7daf81e017078e894d111d38f9474ecc1b3ced0f4105cea29dfb1b7fb
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ac9c034b99ef410ab0c1727f83409cd8d8d063b20082ace208be85e0a1bbd4c8314b983caf4c2a80af82d9ca6245e3616d67dae3f60cdff4a204bea087b2d777
+  data.tar.gz: 1b13b4c36751a4dd02888c3867acafd0594d6a1a83e00668481a07533997de3bfce5abd68b66fc56adbb5bce6006601d1994daa16ebdb3738bd9309f0ff244a1
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,25 @@
 # Changelog

+## [0.6.51] - 2026-05-27
+### Added
+- `Faraday::RetryAfter` middleware honoring `Retry-After` per RFC 9110 §10.2.3 (originally RFC 7231 §7.1.3) in both delta-seconds and HTTP-date forms. Retries HTTP 429 by default; 503/504 opt-in. Bounded by `max_retries` and cumulative `max_wait`; ±jitter on advertised wait to avoid thundering-herd. Configurable via `microsoft_teams.client.retry.{max_retries,max_wait,jitter,fallback_wait,retry_statuses}`; `fetch` semantics preserve explicit falsey values.
+- `Errors::Throttled` exception with `status`, `retry_after` (nullable; `nil` distinguishes "no server guidance" from "retry immediately"), `retry_after_known?` predicate, `request`, and `attempts`.
+- `Faraday::RetryAfter.parse_header` shared class method (single source of truth for Retry-After parsing).
+- `default_settings` for all actors with explicit `enabled`, `interval`, and tuning knobs. Actors no longer need nil guards — values are guaranteed by the extension settings merge on boot. Defaults: `api_ingest` 3600s, `incremental_sync` 900s, `presence_poller` disabled, `channel_poller` disabled, `direct_chat_poller` disabled, `observed_chat_poller` disabled, `meeting_ingest` 900s, `profile_ingest` enabled (once on boot).
+- `Helpers::GraphCache` module — `cached_graph_get` wraps Graph API calls with `Legion::Cache` TTL-based caching. Supports `shared: true` for resource-scoped endpoints (e.g., `/chats/{id}/members` — same data for all participants) vs user-scoped for `/me/*` endpoints. Cache keys incorporate the process identity UUID to prevent cross-user leakage.
+
+### Fixed
+- Stuck or chatty consumers no longer brown out other users on the same Entra app registration's Graph quota. The `RetryAfter` middleware now raises `Errors::Throttled` **centrally on exhaustion** — every consumer of `graph_connection` and `bot_connection` gets the typed event without per-callsite handling. Fixes #18.
+- `Helpers::GraphClient#handle_graph_response` retains a defensive 429/503/504 branch that raises `Errors::Throttled` for callers that build a Faraday connection without the middleware (custom tests, ad-hoc tooling).
+- Logger acquisition failures no longer silently drop retry telemetry — falls back to `Legion::Logging` unconditionally; loss of those signals was how the original outage went undiagnosed for days.
+- **O(N×M) member scan eliminated** — `ProfileIngest#find_chat_for_person` and `ApiIngest#match_chat_to_person` previously called `GET /chats/{id}/members` for every chat × every person (up to 500+ calls per tick). Replaced with `build_chat_member_index` that fetches members once per chat and builds an in-memory lookup hash. Reduces ~514 calls/tick to ~50 for `IncrementalSync`; ~7,500 calls/tick to ~65 for `ApiIngest`.
+- `IncrementalSync` interval raised from 120s to 900s (was the single largest source of sustained Graph pressure).
+- `ApiIngest` interval raised from 1800s to 3600s.
+- Actors that were accidentally re-enabled by `952607c` (rubocop removed `return false` guards) are now properly gated by `settings[:actor_name][:enabled]` — `presence_poller`, `channel_poller`, `direct_chat_poller`, and `observed_chat_poller` all default to `false`.
+
+### Known follow-up
+- Actors (`*_poller.rb`, `meeting_ingest`, `profile_ingest`) still catch `Errors::Throttled` via the generic `rescue StandardError` block but do not yet *defer their next scheduled run* using the carried `retry_after`. To be addressed in a follow-up issue.
+
 ## [0.6.50] - 2026-05-27
 ### Added
 - Full OData query parameter support across all Graph API runner methods per Microsoft Graph REST v1.0 docs
|
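The changelog entry on the eliminated O(N×M) member scan can be illustrated with a minimal sketch. It assumes a hypothetical `fetch_members` callable standing in for one `GET /chats/{id}/members` request, and a hypothetical member shape of `{ user_id: ... }`; the real `build_chat_member_index` signature and Graph payload are not shown in this diff and may differ.

```ruby
# Hypothetical sketch: the method signature and data shapes are assumptions,
# not the gem's actual API. `fetch_members` stands in for one Graph call per chat.
def build_chat_member_index(chat_ids, fetch_members)
  index = Hash.new { |hash, key| hash[key] = [] }
  chat_ids.each do |chat_id|
    # One request per chat, regardless of how many people are matched later.
    fetch_members.call(chat_id).each do |member|
      user_id = member[:user_id]
      index[user_id] << chat_id if user_id
    end
  end
  index
end

# Stubbed membership data standing in for Graph responses.
members_by_chat = {
  'chat-1' => [{ user_id: 'alice' }, { user_id: 'bob' }],
  'chat-2' => [{ user_id: 'bob' }, { user_id: 'carol' }]
}

calls = 0
fetcher = lambda do |chat_id|
  calls += 1
  members_by_chat.fetch(chat_id, [])
end

index = build_chat_member_index(members_by_chat.keys, fetcher)

# Matching N people is now N hash lookups instead of N x M requests.
people = %w[alice bob carol dave]
matches = people.to_h { |person| [person, index.fetch(person, [])] }
puts "requests=#{calls} matches=#{matches.inspect}"
```

The number of requests grows with the number of chats only; adding more people to match adds hash lookups, not Graph calls.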
data/lib/legion/extensions/microsoft_teams/actors/api_ingest.rb
CHANGED
@@ -27,12 +27,12 @@ module Legion
           end

           def time
-
-            interval.to_i
+            teams_settings.dig(:api_ingest, :interval)
           end

           def enabled?
-
+            teams_settings.dig(:api_ingest, :enabled) &&
+              defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces)
           rescue StandardError => e
             handle_exception(e, level: :warn, operation: 'ApiIngest#enabled?')
             false
@@ -46,13 +46,13 @@ module Legion
               return
             end

-
+            ai_settings = teams_settings[:api_ingest]
             log.info('ApiIngest: starting Graph API ingest')
             result = runner_class.ingest_api(
               token: token,
-              top_people:
-              message_depth:
-              skip_bots:
+              top_people: ai_settings[:top_people],
+              message_depth: ai_settings[:message_depth],
+              skip_bots: ai_settings[:skip_bots],
               imprint_active: imprint_active?
             )
             log.info("ApiIngest: #{result.inspect[0, 200]}")
@@ -75,12 +75,7 @@ module Legion
           end

           def teams_settings
-
-
-            Legion::Settings[:microsoft_teams] || {}
-          rescue StandardError => e
-            handle_exception(e, level: :warn, operation: 'ApiIngest#teams_settings')
-            {}
+            settings
           end

           def imprint_active?
data/lib/legion/extensions/microsoft_teams/actors/channel_poller.rb
CHANGED
@@ -7,10 +7,6 @@ module Legion
         class ChannelPoller < Legion::Extensions::Actors::Every
           include Legion::Extensions::MicrosoftTeams::Helpers::Client

-          DEFAULT_INTERVAL = 60
-          DEFAULT_MAX_TEAMS = 10
-          DEFAULT_MAX_CHANNELS = 5
-
           def initialize(**opts)
             return unless enabled?

@@ -20,7 +16,7 @@ module Legion

           def runner_class = self.class
           def runner_function = 'manual'
-          def time =
+          def time = settings.dig(:channel_poller, :interval)
           def delay = 300
           def run_now? = false
           def use_runner? = false
@@ -36,7 +32,7 @@ module Legion
           end

           def enabled?
-
+            settings.dig(:channel_poller, :enabled)
           rescue StandardError => e
             handle_exception(e, level: :debug, operation: 'ChannelPoller#enabled?')
             false
@@ -148,11 +144,11 @@ module Legion
           end

           def max_teams
-
+            settings.dig(:channel_poller, :max_teams)
           end

           def max_channels_per_team
-
+            settings.dig(:channel_poller, :max_channels_per_team)
           end

           def delegated_token
@@ -162,17 +158,8 @@ module Legion
             nil
           end

-          def channel_setting(key, default)
-            return default unless defined?(Legion::Settings)
-
-            Legion::Settings.dig(:microsoft_teams, :channels, key) || default
-          rescue StandardError => e
-            handle_exception(e, level: :debug, operation: "ChannelPoller#channel_setting(#{key})")
-            default
-          end
-
           def channel_traces_enabled?
-
+            settings.dig(:channel_poller, :store_traces) == true
           end

           def store_channel_message_trace(team_name:, channel_name:, msg:)
data/lib/legion/extensions/microsoft_teams/actors/direct_chat_poller.rb
CHANGED
@@ -8,8 +8,6 @@ module Legion
           include Legion::Extensions::MicrosoftTeams::Helpers::Client
           include Legion::Extensions::MicrosoftTeams::Helpers::HighWaterMark

-          POLL_INTERVAL = 15
-
           def initialize(**opts)
             return unless enabled?

@@ -19,7 +17,7 @@ module Legion

           def runner_class = Legion::Extensions::MicrosoftTeams::Runners::Bot
           def runner_function = 'handle_message'
-          def time =
+          def time = settings.dig(:direct_chat_poller, :interval)
           def delay = 60
           def run_now? = false
           def use_runner? = false
@@ -27,7 +25,8 @@ module Legion
           def generate_task? = false

           def enabled?
-
+            settings.dig(:direct_chat_poller, :enabled) &&
+              defined?(Legion::Extensions::MicrosoftTeams::Runners::Bot) &&
               Legion.const_defined?(:Transport, false)
           rescue StandardError => e
             handle_exception(e, level: :debug, operation: 'DirectChatPoller#enabled?')
@@ -95,9 +94,7 @@ module Legion
           end

           def bot_id_from_settings
-
-
-            Legion::Settings.dig(:microsoft_teams, :bot, :bot_id)
+            settings.dig(:bot, :bot_id)
           end

           def delegated_token
@@ -106,12 +103,6 @@ module Legion
             handle_exception(e, level: :warn, operation: 'DirectChatPoller#delegated_token')
             nil
           end
-
-          def settings_interval(key, default)
-            return default unless defined?(Legion::Settings)
-
-            Legion::Settings.dig(:microsoft_teams, :bot, key) || default
-          end
         end
       end
     end
data/lib/legion/extensions/microsoft_teams/actors/incremental_sync.rb
CHANGED
@@ -14,17 +14,12 @@ module Legion
           def delay = 60

           def time
-            settings
-              Legion::Settings[:microsoft_teams] || {}
-            rescue StandardError => e
-              handle_exception(e, level: :debug, operation: 'IncrementalSync#time')
-              {}
-            end
-            settings.dig(:ingest, :incremental_interval) || 120
+            settings.dig(:incremental_sync, :interval)
           end

           def enabled?
-
+            settings.dig(:incremental_sync, :enabled) &&
+              defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces) &&
               token_available?
           rescue StandardError => e
             handle_exception(e, level: :debug, operation: 'IncrementalSync#enabled?')
@@ -36,17 +31,11 @@ module Legion
             token = resolve_token
             return unless token

-
-              Legion::Settings[:microsoft_teams] || {}
-            rescue StandardError => e
-              handle_exception(e, level: :debug, operation: 'IncrementalSync#manual settings')
-              {}
-            end
-            ingest = settings[:ingest] || {}
+            is_settings = settings[:incremental_sync]
             runner_class.incremental_sync(
               token: token,
-              top_people:
-              message_depth:
+              top_people: is_settings[:top_people],
+              message_depth: is_settings[:message_depth]
             )
           rescue StandardError => e
             handle_exception(e, level: :error, operation: 'IncrementalSync#manual')
data/lib/legion/extensions/microsoft_teams/actors/meeting_ingest.rb
CHANGED
@@ -7,8 +7,6 @@ module Legion
         class MeetingIngest < Legion::Extensions::Actors::Every
           include Legion::Extensions::MicrosoftTeams::Helpers::Client

-          DEFAULT_INGEST_INTERVAL = 300
-
           def runner_class = self.class
           def runner_function = 'manual'
           def run_now? = false
@@ -22,17 +20,12 @@ module Legion
           end

           def time
-            settings
-              Legion::Settings[:microsoft_teams] || {}
-            rescue StandardError => e
-              handle_exception(e, level: :debug, operation: 'MeetingIngest#time')
-              {}
-            end
-            settings.dig(:meetings, :ingest_interval) || DEFAULT_INGEST_INTERVAL
+            settings.dig(:meeting_ingest, :interval)
           end

           def enabled?
-
+            settings.dig(:meeting_ingest, :enabled) &&
+              Legion::Extensions::Identity::Entra::Helpers::TokenManager.respond_to?(:load_token)
           rescue StandardError => e
             handle_exception(e, level: :debug, operation: 'MeetingIngest#enabled?')
             false
data/lib/legion/extensions/microsoft_teams/actors/observed_chat_poller.rb
CHANGED
@@ -8,8 +8,6 @@ module Legion
           include Legion::Extensions::MicrosoftTeams::Helpers::Client
           include Legion::Extensions::MicrosoftTeams::Helpers::HighWaterMark

-          POLL_INTERVAL = 30
-
           def initialize(**opts)
             return unless enabled?

@@ -18,7 +16,7 @@ module Legion

           def runner_class = Legion::Extensions::MicrosoftTeams::Runners::Bot
           def runner_function = 'observe_message'
-          def time =
+          def time = settings.dig(:observed_chat_poller, :interval)
           def delay = 180
           def run_now? = false
           def use_runner? = false
@@ -26,11 +24,9 @@ module Legion
           def generate_task? = false

           def enabled?
-
-
-
-
-            Legion::Settings.dig(:microsoft_teams, :bot, :observe, :enabled) == true
+            settings.dig(:observed_chat_poller, :enabled) &&
+              defined?(Legion::Extensions::MicrosoftTeams::Runners::Bot) &&
+              Legion.const_defined?(:Transport, false)
           rescue StandardError => e
             handle_exception(e, level: :debug, operation: 'ObservedChatPoller#enabled?')
             false
@@ -106,12 +102,6 @@ module Legion
             handle_exception(e, level: :warn, operation: 'ObservedChatPoller#delegated_token')
             nil
           end
-
-          def settings_interval(key, default)
-            return default unless defined?(Legion::Settings)
-
-            Legion::Settings.dig(:microsoft_teams, :bot, key) || default
-          end
         end
       end
     end
data/lib/legion/extensions/microsoft_teams/actors/presence_poller.rb
CHANGED
@@ -7,8 +7,6 @@ module Legion
         class PresencePoller < Legion::Extensions::Actors::Every
           include Legion::Extensions::MicrosoftTeams::Helpers::Client

-          DEFAULT_POLL_INTERVAL = 60
-
           def runner_class = self.class
           def runner_function = 'manual'
           def run_now? = false
@@ -17,13 +15,12 @@ module Legion
           def generate_task? = false

           def time
-
-
-            Legion::Settings.dig(:microsoft_teams, :presence, :poll_interval) || DEFAULT_POLL_INTERVAL
+            settings.dig(:presence_poller, :interval)
           end

           def enabled?
-
+            settings.dig(:presence_poller, :enabled) &&
+              Legion::Extensions::Identity::Entra::Helpers::TokenManager.respond_to?(:load_token)
           rescue StandardError => e
             handle_exception(e, level: :debug, operation: 'PresencePoller#enabled?')
             false
data/lib/legion/extensions/microsoft_teams/actors/profile_ingest.rb
CHANGED
@@ -21,7 +21,8 @@ module Legion
           end

           def enabled?
-
+            settings.dig(:profile_ingest, :enabled) &&
+              defined?(Legion::Extensions::Agentic::Memory::Trace::Runners::Traces) &&
               token_available?
           rescue StandardError => e
             handle_exception(e, level: :debug, operation: 'ProfileIngest#enabled?')
@@ -37,17 +38,11 @@ module Legion
             end
             log.info('ProfileIngest: token acquired, starting ingest')

-
-              Legion::Settings[:microsoft_teams] || {}
-            rescue StandardError => e
-              handle_exception(e, level: :debug, operation: 'ProfileIngest#manual settings')
-              {}
-            end
-            ingest = settings[:ingest] || {}
+            pi_settings = settings[:profile_ingest]
             runner_class.full_ingest(
               token: token,
-              top_people:
-              message_depth:
+              top_people: pi_settings[:top_people],
+              message_depth: pi_settings[:message_depth]
             )
           rescue StandardError => e
             handle_exception(e, level: :error, operation: 'ProfileIngest#manual')
data/lib/legion/extensions/microsoft_teams/errors.rb
ADDED
@@ -0,0 +1,84 @@
+# frozen_string_literal: true
+
+module Legion
+  module Extensions
+    module MicrosoftTeams
+      module Errors
+        # Raised when Microsoft Graph (or the Bot Framework) throttles the
+        # caller and the retry policy has been exhausted, or when an actor
+        # wants to surface a throttle event without retrying further.
+        #
+        # `retry_after` carries the last advertised Retry-After interval
+        # (in seconds) as parsed from the upstream header. **It is nil when
+        # the server returned no Retry-After header or one we could not
+        # parse** — callers must check `retry_after_known?` before treating
+        # the value as a server directive. Conflating "header missing" with
+        # "retry immediately" was the bug the original fleet outage exposed.
+        class Throttled < StandardError
+          attr_reader :status, :retry_after, :request, :attempts
+
+          # @param status [Integer] the upstream HTTP status (e.g. 429)
+          # @param retry_after [Float, Integer, nil] seconds the server
+          #   advised waiting, or nil if the header was absent/unparseable
+          # @param request [String, nil] the path or URL that was throttled
+          # @param attempts [Integer, nil] how many retries the middleware
+          #   tried before giving up; nil means "not tracked"
+          def initialize(status:, retry_after: nil, request: nil, attempts: nil)
+            @status = coerce_status(status)
+            @retry_after = coerce_retry_after(retry_after)
+            @request = request
+            @attempts = attempts.nil? ? nil : Integer(attempts)
+            super(build_message)
+          end
+
+          # @return [Boolean] true when the upstream advised a specific wait
+          #   interval; false when the header was missing or unparseable. Use
+          #   this to decide whether to honor the wait verbatim or apply a
+          #   local policy default before re-scheduling.
+          def retry_after_known?
+            !@retry_after.nil?
+          end
+
+          private
+
+          def coerce_status(value)
+            Integer(value)
+          # rubocop:disable Legion/RescueLogging/NoCapture
+          # No logger available during exception construction; we re-raise
+          # with a clearer message instead.
+          rescue ArgumentError, TypeError
+            raise ArgumentError, "Throttled status must be an Integer, got #{value.inspect}"
+            # rubocop:enable Legion/RescueLogging/NoCapture
+          end
+
+          def coerce_retry_after(value)
+            return nil if value.nil?
+
+            seconds = Float(value)
+            seconds.negative? ? 0.0 : seconds
+          # rubocop:disable Legion/RescueLogging/NoCapture
+          # Unparseable retry_after intentionally collapses to nil so the
+          # public `retry_after_known?` predicate is the single source of
+          # truth for "did the server give us usable guidance." Logging
+          # belongs at the parse site (Faraday::RetryAfter), not here.
+          rescue ArgumentError, TypeError
+            nil
+            # rubocop:enable Legion/RescueLogging/NoCapture
+          end
+
+          def build_message
+            parts = ["Microsoft Graph throttled (HTTP #{@status})"]
+            parts << "after #{@attempts} attempt(s)" if @attempts
+            parts << if retry_after_known?
+                       "retry_after=#{format('%.2f', @retry_after)}s"
+                     else
+                       'retry_after=unknown'
+                     end
+            parts << "request=#{@request}" if @request
+            parts.join('; ')
+          end
+        end
+      end
+    end
+  end
+end
|
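The distinction between a `nil` and a zero `retry_after` matters to any caller that reschedules work. The sketch below is one possible way a caller might use `retry_after_known?`. It defines a small stand-in `Throttled` class so it runs without the gem installed; `next_run_delay`, its `interval:` and `fallback:` parameters, and the 30-second fallback are all hypothetical and not part of the released code.

```ruby
# Minimal stand-in that mirrors part of the public surface of Errors::Throttled
# (status, retry_after, retry_after_known?). It is simplified: unlike the real
# class, it raises on an unparseable retry_after instead of collapsing it to nil.
class Throttled < StandardError
  attr_reader :status, :retry_after

  def initialize(status:, retry_after: nil)
    @status = Integer(status)
    @retry_after = retry_after.nil? ? nil : [Float(retry_after), 0.0].max
    super("throttled (HTTP #{@status})")
  end

  def retry_after_known?
    !@retry_after.nil?
  end
end

# Hypothetical helper: seconds to wait before the next scheduled run.
# A known retry_after (even 0.0) is treated as server guidance; only a
# missing one falls back to the local policy default.
def next_run_delay(error, interval:, fallback: 30.0)
  advised = error.retry_after_known? ? error.retry_after : fallback
  [advised, interval].max
end

begin
  raise Throttled.new(status: 429, retry_after: 120)
rescue Throttled => e
  delay = next_run_delay(e, interval: 60)
  puts "deferring next run by #{delay}s"
end
```

Taking the larger of the advised wait and the normal interval means a throttle event can only push the next run later, never earlier.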
data/lib/legion/extensions/microsoft_teams/faraday/retry_after.rb
ADDED
@@ -0,0 +1,209 @@
+# frozen_string_literal: true
+
+require 'faraday'
+require 'time'
+require 'legion/extensions/microsoft_teams/errors'
+
+module Legion
+  module Extensions
+    module MicrosoftTeams
+      module Faraday
+        # Faraday middleware that retries throttled responses honoring the
+        # upstream Retry-After header per RFC 9110 §10.2.3 (originally
+        # specified in RFC 7231 §7.1.3). Retries HTTP 429 by default; 503
+        # and 504 are opt-in via `retry_statuses:`.
+        #
+        # Retry-After is parsed in two forms:
+        #
+        #   * delta-seconds (e.g. "120") — used as-is
+        #   * HTTP-date (e.g. "Wed, 27 May 2026 12:00:00 GMT")
+        #     — converted to delta
+        #       from current UTC,
+        #       clamped to >= 0
+        #
+        # The advertised wait is jittered by ±(jitter * wait) to avoid
+        # thundering-herd behavior across instances sharing one Entra app
+        # registration's Graph quota.
+        #
+        # When `max_retries` is reached, or cumulative wait would exceed
+        # `max_wait`, the middleware raises `Errors::Throttled` carrying
+        # the last advertised Retry-After (nil if the header was missing or
+        # unparseable), the final HTTP status, attempt count, and request
+        # path. Raising centrally — rather than returning a raw 429 and
+        # trusting every caller to detect it — is the difference between
+        # one typed event the fleet can defer on, and 60+ runner callsites
+        # that silently treat throttle envelopes as data.
+        class RetryAfter < ::Faraday::Middleware
+          DEFAULT_MAX_RETRIES = 3
+          DEFAULT_MAX_WAIT = 60.0
+          DEFAULT_JITTER = 0.2
+          DEFAULT_FALLBACK_WAIT = 1.0
+          DEFAULT_RETRY_STATUSES = [429].freeze
+
+          # Parse an HTTP Retry-After header value.
+          #
+          # @param raw [String, nil] the raw header value
+          # @param clock [#call] a callable returning current UTC Time,
+          #   injectable for deterministic tests
+          # @return [Float, nil] seconds to wait, or nil if `raw` is absent,
+          #   empty, or neither a numeric delta nor a valid HTTP-date
+          def self.parse_header(raw, clock: -> { Time.now.utc })
+            return nil if raw.nil?
+
+            value = raw.to_s.strip
+            return nil if value.empty?
+            return value.to_f if value.match?(/\A\d+(\.\d+)?\z/)
+
+            begin
+              target = Time.httpdate(value).utc
+              [(target - clock.call), 0.0].max
+            # rubocop:disable Legion/RescueLogging/NoCapture
+            # Pure parser — no logger access. The instance method
+            # `parse_advertised` warns on the same condition with full
+            # context; logging twice would just be noise.
+            rescue ArgumentError
+              nil
+              # rubocop:enable Legion/RescueLogging/NoCapture
+            end
+          end
+
+          def initialize(app, # rubocop:disable Metrics/ParameterLists
+                         max_retries: DEFAULT_MAX_RETRIES,
+                         max_wait: DEFAULT_MAX_WAIT,
+                         jitter: DEFAULT_JITTER,
+                         fallback_wait: DEFAULT_FALLBACK_WAIT,
+                         retry_statuses: DEFAULT_RETRY_STATUSES,
+                         sleeper: ->(seconds) { sleep(seconds) },
+                         logger: nil,
+                         clock: -> { Time.now.utc })
+            super(app)
+            @max_retries = Integer(max_retries)
+            @max_wait = Float(max_wait)
+            @jitter = Float(jitter)
+            @fallback_wait = Float(fallback_wait)
+            @retry_statuses = Array(retry_statuses).map(&:to_i).freeze
+            @sleeper = sleeper
+            @logger = logger
+            @clock = clock
+          end
+
+          def call(env)
+            attempts = 0
+            total_wait = 0.0
+            last_advertised = nil
+
+            loop do
+              response = @app.call(env.dup)
+              return response unless retryable?(response.status)
+
+              last_advertised = parse_advertised(response)
+              wait = compute_wait(last_advertised)
+
+              if attempts >= @max_retries || (total_wait + wait) > @max_wait
+                log_giveup(env, response, attempts, total_wait)
+                raise Errors::Throttled.new(
+                  status: response.status,
+                  retry_after: last_advertised,
+                  request: request_path(env),
+                  attempts: attempts
+                )
+              end
+
+              attempts += 1
+              total_wait += wait
+              log_retry(env, response, wait, attempts)
+              @sleeper.call(wait)
+            end
+          end
+
+          private
+
+          def retryable?(status)
+            @retry_statuses.include?(status.to_i)
+          end
+
+          # Parse the advertised Retry-After value from a response. Returns
+          # nil if the header is missing or unparseable — callers branch on
+          # nil to distinguish "no server guidance" from "retry now".
+          def parse_advertised(response)
+            raw = retry_after_header(response)
+            parsed = self.class.parse_header(raw, clock: @clock)
+            @logger&.warn("[microsoft_teams][retry_after] unparseable Retry-After=#{raw.inspect}") if raw && !raw.to_s.strip.empty? && parsed.nil?
+            parsed
+          end
+
+          # Computes the actual wait the middleware will sleep before the
+          # next attempt. Falls back to `@fallback_wait` only when the
+          # server gave no usable guidance; jitter is always applied so
+          # concurrent instances don't synchronize their retries.
+          def compute_wait(advertised)
+            seconds = advertised || @fallback_wait
+            apply_jitter(seconds)
+          end
+
+          def retry_after_header(response)
+            headers = response_headers(response)
+            return nil unless headers
+
+            headers['Retry-After'] ||
+              headers['retry-after'] ||
+              headers['RETRY-AFTER']
+          end
+
+          # Faraday's Response exposes headers via #headers; some test
+          # doubles may not, so look in both common places.
+          def response_headers(response)
+            if response.respond_to?(:headers) && response.headers
+              response.headers
+            elsif response.respond_to?(:response_headers)
+              response.response_headers
+            end
+          end
+
+          def apply_jitter(seconds)
+            return seconds if @jitter.zero?
+
+            spread = seconds * @jitter
+            offset = ((rand * 2.0) - 1.0) * spread
+            wait = seconds + offset
+            wait.negative? ? 0.0 : wait
+          end
+
+          def request_path(env)
+            env.url.respond_to?(:path) ? env.url.path : env.url.to_s
+          # rubocop:disable Legion/RescueLogging/NoCapture
+          # Defensive fallback for malformed env.url; the path is only used
+          # for log lines and error messages, never for control flow.
+          rescue StandardError
+            nil
+            # rubocop:enable Legion/RescueLogging/NoCapture
+          end
+
+          def log_retry(env, response, wait, attempts)
+            return unless @logger
+
+            @logger.warn(
+              "[microsoft_teams][retry_after] status=#{response.status} " \
+              "method=#{env.method.to_s.upcase} path=#{request_path(env)} " \
+              "wait=#{format('%.2f', wait)}s attempt=#{attempts}"
+            )
+          rescue StandardError => e
+            warn("[microsoft_teams][retry_after] log_retry suppressed #{e.class}: #{e.message}")
+          end
+
+          def log_giveup(env, response, attempts, total_wait)
+            return unless @logger
+
+            @logger.error(
+              "[microsoft_teams][retry_after] giving up; status=#{response.status} " \
+              "method=#{env.method.to_s.upcase} path=#{request_path(env)} " \
+              "attempts=#{attempts} total_wait=#{format('%.2f', total_wait)}s"
+            )
+          rescue StandardError => e
+            warn("[microsoft_teams][retry_after] log_giveup suppressed #{e.class}: #{e.message}")
+          end
+        end
+      end
+    end
+  end
+end
|
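The give-up rule in `RetryAfter#call` combines two limits, `max_retries` and cumulative `max_wait`. The following is a simplified model of that rule, not the middleware itself: `simulate_retries` is a hypothetical helper, jitter is fixed at zero so the arithmetic is reproducible, and each element of `advertised` represents the parsed Retry-After (or `nil`) of one throttled response, with the request assumed to succeed once the list runs out.

```ruby
# Simplified model of the give-up rule in RetryAfter#call (jitter disabled).
# `simulate_retries` and its return shape are illustrative assumptions.
def simulate_retries(advertised, max_retries: 3, max_wait: 60.0, fallback_wait: 1.0)
  attempts = 0
  total_wait = 0.0
  slept = []

  advertised.each do |header_seconds|
    # A missing or unparseable header falls back to the local default wait.
    wait = header_seconds || fallback_wait

    # Same condition as the middleware: stop when the retry count is spent
    # or when sleeping again would push the cumulative wait past the budget.
    if attempts >= max_retries || (total_wait + wait) > max_wait
      return { outcome: :throttled, attempts: attempts, slept: slept }
    end

    attempts += 1
    total_wait += wait
    slept << wait
  end

  { outcome: :succeeded, attempts: attempts, slept: slept }
end

recovered = simulate_retries([5.0, nil])
over_budget = simulate_retries([30.0, 40.0])
out_of_retries = simulate_retries([1.0, 1.0, 1.0, 1.0])

puts recovered.inspect
puts over_budget.inspect
puts out_of_retries.inspect
```

In the second case the first 30-second wait is honored, but the next advertised 40 seconds would bring the total to 70, above the 60-second budget, so the model gives up after one retry even though two retries remain.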