@cronicorn/mcp-server 1.4.5 → 1.5.0

@@ -0,0 +1,453 @@
1
+ ---
2
+ id: how-ai-adaptation-works
3
+ title: How AI Adaptation Works
4
+ description: AI Planner, hints, tools, response bodies, and adaptation logic
5
+ tags: [assistant, technical, ai-planner]
6
+ sidebar_position: 3
7
+ mcp:
8
+ uri: file:///docs/technical/how-ai-adaptation-works.md
9
+ mimeType: text/markdown
10
+ priority: 0.85
11
+ lastModified: 2025-11-02T00:00:00Z
12
+ ---
13
+
14
+ # How AI Adaptation Works
15
+
16
+ This document explains how the AI Planner analyzes endpoint execution patterns and suggests schedule adjustments. If you haven't read [System Architecture](./system-architecture.md), start there for context on the dual-worker design.
17
+
18
+ ## The AI Planner's Job
19
+
20
+ The AI Planner worker has one responsibility: analyze endpoint execution patterns and suggest scheduling adjustments by writing hints to the database. It doesn't execute jobs, manage locks, or worry about reliability. It observes and recommends.
21
+
22
+ The AI Planner runs independently from the Scheduler. Typically it wakes up every 5 minutes (configurable) and analyzes recently active endpoints. The analysis happens asynchronously—the Scheduler doesn't wait for it.
23
+
24
+ ## Discovery: Finding Endpoints to Analyze
25
+
26
+ The AI Planner doesn't analyze every endpoint on every cycle. That would be expensive (AI API costs) and unnecessary (most endpoints are stable).
27
+
28
+ Instead, it uses a **discovery mechanism** to find endpoints that executed recently:
29
+
30
+ 1. Query the database for endpoints that ran within the last 5-10 minutes
31
+ 2. Filter to endpoints that haven't been analyzed in the last cycle (avoid duplicate work)
32
+ 3. Analyze each discovered endpoint in sequence
33
+
34
+ This approach focuses AI attention on active endpoints where adaptation might be valuable. Dormant endpoints (not running) don't get analyzed until they become active again.
35
+
36
+ The discovery window (5-10 minutes) is configurable. Wider windows catch more endpoints but increase batch size. Narrower windows reduce costs but might miss some activity.
37
+
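The discovery steps above can be sketched as a simple filter. This is a minimal illustration, not the package's implementation: the `EndpointActivity` shape, the field names `lastRunAt` and `lastAnalyzedAt`, and the function `discoverEndpoints` are all assumptions made for the example.

```typescript
// Hypothetical record shape; field names are assumptions for illustration.
interface EndpointActivity {
  id: string;
  lastRunAt: Date | null; // most recent execution, if any
  lastAnalyzedAt: Date | null; // most recent AI analysis, if any
}

// Keep endpoints that ran inside the discovery window and were not
// already analyzed during the previous planner cycle.
function discoverEndpoints(
  endpoints: EndpointActivity[],
  now: Date,
  windowMs: number,
  cycleMs: number,
): EndpointActivity[] {
  const windowStart = now.getTime() - windowMs;
  const lastCycleStart = now.getTime() - cycleMs;
  return endpoints.filter((e) => {
    const ranRecently =
      e.lastRunAt !== null && e.lastRunAt.getTime() >= windowStart;
    const analyzedLastCycle =
      e.lastAnalyzedAt !== null && e.lastAnalyzedAt.getTime() >= lastCycleStart;
    return ranRecently && !analyzedLastCycle;
  });
}

const now = new Date("2025-11-02T14:30:00Z");
const minutesAgo = (m: number) => new Date(now.getTime() - m * 60000);

const discovered = discoverEndpoints(
  [
    { id: "active", lastRunAt: minutesAgo(3), lastAnalyzedAt: minutesAgo(20) },
    { id: "dormant", lastRunAt: minutesAgo(90), lastAnalyzedAt: null },
    { id: "just-analyzed", lastRunAt: minutesAgo(2), lastAnalyzedAt: minutesAgo(1) },
  ],
  now,
  10 * 60000, // 10-minute discovery window
  5 * 60000, // 5-minute planner cycle
);
```

Here only the `active` endpoint is selected: `dormant` falls outside the window, and `just-analyzed` was already covered in the last cycle.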
38
+ ## What the AI Sees: Building Context
39
+
40
+ For each endpoint, the AI Planner builds a comprehensive analysis prompt containing:
41
+
42
+ ### Current Configuration
43
+
44
+ - **Baseline schedule**: Cron expression or interval
45
+ - **Constraints**: Min/max intervals
46
+ - **Pause state**: Whether paused and until when
47
+ - **Active hints**: Current AI interval/one-shot hints and their expiration times
48
+ - **Failure count**: Number of consecutive failures (affects exponential backoff)
49
+
50
+ This tells the AI what scheduling behavior is currently in effect.
51
+
52
+ ### Recent Performance (Last 24 Hours)
53
+
54
+ - **Success rate**: Percentage of successful executions
55
+ - **Total runs**: Number of executions
56
+ - **Average duration**: Mean execution time
57
+ - **Failure streak**: Consecutive failures (signals degradation)
58
+ - **Last run status**: Most recent execution outcome
59
+
60
+ These health metrics help the AI spot trends: improving, degrading, or stable.
61
+
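As a rough illustration of how the metrics listed above could be derived from raw execution records, here is a small sketch. The `Run` shape and the `summarizeRuns` helper are hypothetical; the real planner may compute these values differently (for example, in SQL).

```typescript
// Hypothetical execution record; runs are ordered oldest first.
interface Run {
  ok: boolean;
  durationMs: number;
}

interface HealthSummary {
  totalRuns: number;
  successRatePct: number;
  avgDurationMs: number;
  failureStreak: number; // consecutive failures ending at the newest run
  lastRunStatus: "success" | "failure" | "none";
}

function summarizeRuns(runs: Run[]): HealthSummary {
  const totalRuns = runs.length;
  const successes = runs.filter((r) => r.ok).length;
  const totalDuration = runs.reduce((sum, r) => sum + r.durationMs, 0);

  // Walk backwards from the newest run until the first success.
  let failureStreak = 0;
  for (const r of runs.slice().reverse()) {
    if (r.ok) break;
    failureStreak++;
  }

  const last = runs.slice(-1)[0];
  return {
    totalRuns,
    successRatePct: totalRuns === 0 ? 0 : (successes / totalRuns) * 100,
    avgDurationMs: totalRuns === 0 ? 0 : totalDuration / totalRuns,
    failureStreak,
    lastRunStatus: last === undefined ? "none" : last.ok ? "success" : "failure",
  };
}

const summary = summarizeRuns([
  { ok: true, durationMs: 100 },
  { ok: true, durationMs: 120 },
  { ok: false, durationMs: 300 },
  { ok: true, durationMs: 110 },
  { ok: false, durationMs: 400 },
  { ok: false, durationMs: 410 },
]);
```

In this sample the endpoint has a 50% success rate over six runs and a failure streak of two, which is the kind of degrading pattern the planner is meant to notice.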
62
+ ### Response Body Data
63
+
64
+ This is the key to intelligent adaptation. Every execution records the **response body**—the JSON returned by your endpoint.
65
+
66
+ The AI can query response data in three ways:
67
+
68
+ 1. **Latest response**: Check current state (one data point)
69
+ 2. **Response history**: Identify trends over time (multiple data points)
70
+ 3. **Sibling responses**: Coordinate across endpoints in the same job
71
+
72
+ The response body can contain any JSON structure you design. The AI looks for signals indicating:
73
+
74
+ - **Load indicators**: queue_depth, pending_count, backlog_size
75
+ - **Performance metrics**: latency, p95, processing_time
76
+ - **Error rates**: error_count, failure_rate, error_pct
77
+ - **Health flags**: healthy, available, status, state
78
+ - **Coordination signals**: ready_for_processing, dependency_status
79
+ - **Timestamps**: last_success_at, completed_at (detect staleness)
80
+
81
+ Structure your response bodies to include the metrics that matter for your use case.
82
+
83
+ ### Job Context
84
+
85
+ If the endpoint belongs to a job, the AI sees the job's description. This provides high-level intent:
86
+
87
+ > "Monitors payment queue and triggers processing when depth exceeds threshold"
88
+
89
+ The AI uses this context to interpret what "good" vs "bad" looks like for specific metrics. A growing queue_depth might be normal for a collector endpoint but alarming for a processor endpoint.
90
+
91
+ ## The Three Tools: How AI Takes Action
92
+
93
+ The AI Planner doesn't write to the database directly. Instead, it has access to **three action tools** that write hints:
94
+
95
+ ### 1. propose_interval
96
+
97
+ **Purpose**: Adjust how frequently the endpoint runs
98
+
99
+ **Parameters**:
100
+ - `intervalMs`: New interval in milliseconds
101
+ - `ttlMinutes`: How long the hint is valid (default: 60 minutes)
102
+ - `reason`: Optional explanation
103
+
104
+ **When to use**:
105
+ - Tighten monitoring during load spikes (5min → 30sec)
106
+ - Relax during stability (1min → 10min to save resources)
107
+ - Override exponential backoff after recovery (restore normal cadence)
108
+
109
+ **Example**:
110
+ ```
111
+ AI sees queue_depth growing: 50 → 100 → 200 over last 10 runs
112
+ Action: propose_interval(intervalMs=30000, ttlMinutes=15, reason="Growing queue requires tighter monitoring")
113
+ Effect: Runs every 30 seconds for 15 minutes, then reverts to baseline
114
+ ```
115
+
116
+ **How it works**:
117
+ 1. AI calls the tool with parameters
118
+ 2. Tool writes `aiHintIntervalMs` and `aiHintExpiresAt` to database
119
+ 3. Tool calls `setNextRunAtIfEarlier(now + intervalMs)` to apply immediately (nudging)
120
+ 4. Scheduler's next tick reads the hint, Governor uses it to calculate next run
121
+ 5. After TTL expires, hint is ignored and baseline resumes
122
+
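The first three steps can be sketched in memory as follows. The field names `aiHintIntervalMs`, `aiHintExpiresAt`, and `nextRunAt` and the helper name `setNextRunAtIfEarlier` come from the text above; the `EndpointRow` type, the `aiHintReason` field, and the function signatures are assumptions, and a real implementation would write to the database rather than mutate an object.

```typescript
// Hypothetical in-memory stand-in for an endpoint row.
interface EndpointRow {
  nextRunAt: Date;
  aiHintIntervalMs: number | null;
  aiHintExpiresAt: Date | null;
  aiHintReason: string | null; // assumed field, not named in the doc
}

// Move nextRunAt forward in time only if the candidate is earlier.
function setNextRunAtIfEarlier(row: EndpointRow, candidate: Date): void {
  if (candidate.getTime() < row.nextRunAt.getTime()) {
    row.nextRunAt = candidate;
  }
}

function proposeInterval(
  row: EndpointRow,
  now: Date,
  intervalMs: number,
  ttlMinutes: number = 60, // documented default
  reason?: string,
): void {
  // Step 2: record the hint and when it stops applying.
  row.aiHintIntervalMs = intervalMs;
  row.aiHintExpiresAt = new Date(now.getTime() + ttlMinutes * 60000);
  row.aiHintReason = reason ?? null;
  // Step 3: nudge the next run so the hint takes effect right away.
  setNextRunAtIfEarlier(row, new Date(now.getTime() + intervalMs));
}

const t0 = Date.parse("2025-11-02T14:00:00Z");
const row: EndpointRow = {
  nextRunAt: new Date(t0 + 300000), // baseline run due 5 minutes after t0
  aiHintIntervalMs: null,
  aiHintExpiresAt: null,
  aiHintReason: null,
};

// One minute after t0 the planner proposes a 30-second interval for 15 minutes.
proposeInterval(row, new Date(t0 + 60000), 30000, 15, "Growing queue");
```

After the call, `nextRunAt` has moved from 300 seconds to 90 seconds after `t0`, and the hint expires 15 minutes after the proposal.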
123
+ ### 2. propose_next_time
124
+
125
+ **Purpose**: Schedule a specific one-time execution
126
+
127
+ **Parameters**:
128
+ - `nextRunAtIso`: ISO 8601 timestamp for next run
129
+ - `ttlMinutes`: How long the hint is valid (default: 30 minutes)
130
+ - `reason`: Optional explanation
131
+
132
+ **When to use**:
133
+ - Run immediately to investigate a failure (now)
134
+ - Defer to off-peak hours (specific timestamp)
135
+ - Coordinate with external events (batch completion time)
136
+
137
+ **Example**:
138
+ ```
139
+ AI sees failure spike (success rate drops to 45%)
140
+ Action: propose_next_time(nextRunAtIso=now, ttlMinutes=5, reason="Investigate failure spike")
141
+ Effect: Runs immediately once, then resumes baseline schedule
142
+ ```
143
+
144
+ **How it works**:
145
+ 1. AI calls the tool with a timestamp
146
+ 2. Tool writes `aiHintNextRunAt` and `aiHintExpiresAt` to database
147
+ 3. Tool calls `setNextRunAtIfEarlier(timestamp)` to apply immediately
148
+ 4. Scheduler claims endpoint when `nextRunAt` arrives
149
+ 5. After execution or TTL expiry, hint is cleared and baseline resumes
150
+
151
+ ### 3. pause_until
152
+
153
+ **Purpose**: Stop execution temporarily or resume
154
+
155
+ **Parameters**:
156
+ - `untilIso`: ISO 8601 timestamp to pause until, or `null` to resume
157
+ - `reason`: Optional explanation
158
+
159
+ **When to use**:
160
+ - Dependency is down (pause until it recovers)
161
+ - Rate limit detected (pause for cooldown period)
162
+ - Maintenance window (pause until completion)
163
+ - Resume after manual pause (pass `null`)
164
+
165
+ **Example**:
166
+ ```
167
+ AI sees responseBody: { dependency_status: "unavailable" }
168
+ Action: pause_until("2025-11-02T15:30:00Z", reason="Dependency unavailable")
169
+ Effect: No executions until 15:30 UTC, then resumes baseline
170
+ ```
171
+
172
+ **How it works**:
173
+ 1. AI calls the tool with a timestamp (or `null`)
174
+ 2. Tool writes `pausedUntil` to database
175
+ 3. Scheduler's Governor checks pause state—if `pausedUntil > now`, returns that timestamp with source `"paused"`
176
+ 4. When pause time passes, Governor resumes normal scheduling
177
+
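The Governor's pause check in step 3 might look roughly like this. Only `pausedUntil` and the `"paused"` source label come from the text; the `applyPause` function and the `"scheduled"` label are placeholders invented for the sketch.

```typescript
// "paused" is the documented source label; "scheduled" is a placeholder
// for whatever the normal scheduling path would report.
type NextRun = { at: Date; source: "paused" | "scheduled" };

function applyPause(
  pausedUntil: Date | null,
  now: Date,
  scheduled: Date,
): NextRun {
  // An active pause wins over any other scheduling decision.
  if (pausedUntil !== null && pausedUntil.getTime() > now.getTime()) {
    return { at: pausedUntil, source: "paused" };
  }
  // No pause, or the pause has already passed: schedule normally.
  return { at: scheduled, source: "scheduled" };
}

const pausedUntil = new Date("2025-11-02T15:30:00Z");
const scheduled = new Date("2025-11-02T15:05:00Z");

const duringPause = applyPause(pausedUntil, new Date("2025-11-02T15:00:00Z"), scheduled);
const afterPause = applyPause(pausedUntil, new Date("2025-11-02T15:31:00Z"), scheduled);
const resumed = applyPause(null, new Date("2025-11-02T15:00:00Z"), scheduled);
```

Passing `null` models the resume case: with no `pausedUntil` value the check falls through to normal scheduling immediately.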
178
+ ## Query Tools: Informing Decisions
179
+
180
+ Before taking action, the AI can query response data using **three query tools**:
181
+
182
+ ### 1. get_latest_response
183
+
184
+ **Returns**: Most recent response body, timestamp, status
185
+
186
+ **Use case**: Check current state snapshot
187
+
188
+ Example: "What's the current queue depth?"
189
+
190
+ ### 2. get_response_history
191
+
192
+ **Parameters**:
193
+ - `limit`: Number of responses (1-10, default 2)
194
+ - `offset`: Skip N newest responses for pagination
195
+
196
+ **Returns**: Array of response bodies with timestamps, newest first
197
+
198
+ **Use case**: Identify trends over time
199
+
200
+ Example: "Is queue_depth increasing monotonically?"
201
+
202
+ **Efficiency tip**: Start with `limit=2` to check recent trend. If you need more context, use `offset=2, limit=3` to get the next 3 older responses.
203
+
204
+ Response bodies are truncated at 1000 characters to prevent token overflow.
205
+
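To make the `limit`/`offset` paging and the 1000-character truncation concrete, here is a minimal in-memory sketch. The `StoredResponse` shape and this `getResponseHistory` function are illustrative assumptions; the real tool reads from the database and may differ in detail.

```typescript
// Hypothetical stored response; the list is ordered newest first.
interface StoredResponse {
  timestamp: string;
  body: string;
}

const MAX_BODY_CHARS = 1000; // documented truncation limit

function getResponseHistory(
  newestFirst: StoredResponse[],
  limit: number = 2, // documented default
  offset: number = 0,
): StoredResponse[] {
  // Keep limit inside the documented 1-10 range.
  const clamped = Math.min(10, Math.max(1, Math.floor(limit)));
  return newestFirst
    .slice(offset, offset + clamped)
    .map((r) => ({ ...r, body: r.body.slice(0, MAX_BODY_CHARS) }));
}

const longBody = new Array(1501).join("x"); // 1500 characters

const stored: StoredResponse[] = [
  { timestamp: "2025-11-02T14:25:00Z", body: '{"queue_depth":150}' },
  { timestamp: "2025-11-02T14:20:00Z", body: '{"queue_depth":50}' },
  { timestamp: "2025-11-02T14:15:00Z", body: '{"queue_depth":48}' },
  { timestamp: "2025-11-02T14:10:00Z", body: '{"queue_depth":52}' },
  { timestamp: "2025-11-02T14:05:00Z", body: longBody },
  { timestamp: "2025-11-02T14:00:00Z", body: '{"queue_depth":51}' },
];

// First look: the two newest responses.
const recent = getResponseHistory(stored);
// Need more context: skip those two and fetch the next three.
const older = getResponseHistory(stored, 3, 2);
```

The second call returns the third through fifth newest responses, and the oversized body among them comes back cut to 1000 characters.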
206
+ ### 3. get_sibling_latest_responses
207
+
208
+ **Returns**: Latest response from each sibling endpoint in the same job
209
+
210
+ **Use case**: Coordinate across endpoints
211
+
212
+ Example: "Did the upstream endpoint finish processing?"
213
+
214
+ Only useful for workflow endpoints (multiple endpoints in the same job that coordinate).
215
+
216
+ ## Hint Mechanics: TTLs and Expiration
217
+
218
+ All AI hints have **time-to-live (TTL)**—they expire automatically. This is a safety mechanism.
219
+
220
+ **Why TTLs matter**:
221
+ - If the AI makes a bad decision (too aggressive, too relaxed), the hint expires and the schedule reverts to baseline on its own
222
+ - Temporary conditions (spikes, failures) don't permanently alter schedules
223
+ - You can experiment with aggressive hints knowing they'll revert
224
+
225
+ **TTL strategy**:
226
+ - **Short (5-15 minutes)**: Transient spikes, immediate investigations
227
+ - **Medium (30-60 minutes)**: Operational shifts, business hours patterns
228
+ - **Long (2-4 hours)**: Sustained degradation, maintenance windows
229
+
230
+ When a hint expires (`aiHintExpiresAt <= now`):
231
+ 1. Scheduler's Governor ignores it during next run calculation
232
+ 2. Baseline schedule resumes
233
+ 3. Hint fields remain in database (for debugging) until next execution clears them
234
+
235
+ ## Override vs. Compete: Hint Semantics
236
+
237
+ Understanding how hints interact with baseline is critical:
238
+
239
+ **AI interval hints OVERRIDE baseline**:
240
+ - If hint is active, Governor ignores baseline completely
241
+ - Enables tightening (5min → 30sec) and relaxing (1min → 10min)
242
+ - Baseline only applies when hint expires
243
+
244
+ **AI one-shot hints COMPETE with baseline**:
245
+ - Governor chooses earliest between hint timestamp and baseline next run
246
+ - Enables immediate runs (one-shot sooner than baseline)
247
+ - Baseline still influences scheduling
248
+
249
+ **When both hints exist**:
250
+ - They compete with each other (earliest wins)
251
+ - Baseline is ignored entirely
252
+
253
+ This design allows the AI to both tighten/relax ongoing cadence (interval) and trigger specific executions (one-shot) without the hints fighting each other.
254
+
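The override and compete rules can be expressed as a small selection function. This is a sketch under stated assumptions: the `SchedulingState` shape, the `planNextRun` name, and the source labels are invented for illustration, and both hints are assumed to share the single `aiHintExpiresAt` field described earlier.

```typescript
interface SchedulingState {
  lastRunAt: Date;
  baselineIntervalMs: number;
  aiHintIntervalMs: number | null;
  aiHintNextRunAt: Date | null;
  aiHintExpiresAt: Date | null;
}

// Source labels are hypothetical names for this sketch.
interface PlannedRun {
  at: Date;
  source: "baseline" | "ai-interval" | "ai-oneshot";
}

function planNextRun(s: SchedulingState, now: Date): PlannedRun {
  // Expired hints (expiresAt <= now) are ignored.
  const active =
    s.aiHintExpiresAt !== null && s.aiHintExpiresAt.getTime() > now.getTime();
  const baseline: PlannedRun = {
    at: new Date(s.lastRunAt.getTime() + s.baselineIntervalMs),
    source: "baseline",
  };
  const interval: PlannedRun | null =
    active && s.aiHintIntervalMs !== null
      ? { at: new Date(s.lastRunAt.getTime() + s.aiHintIntervalMs), source: "ai-interval" }
      : null;
  const oneShot: PlannedRun | null =
    active && s.aiHintNextRunAt !== null
      ? { at: s.aiHintNextRunAt, source: "ai-oneshot" }
      : null;
  const earliest = (a: PlannedRun, b: PlannedRun) =>
    a.at.getTime() <= b.at.getTime() ? a : b;

  if (interval !== null && oneShot !== null) return earliest(interval, oneShot); // baseline ignored
  if (interval !== null) return interval; // interval hint overrides baseline
  if (oneShot !== null) return earliest(oneShot, baseline); // one-shot competes with baseline
  return baseline;
}

const T = Date.parse("2025-11-02T14:00:00Z");
const MIN = 60000;
const current = new Date(T + MIN);
const base: SchedulingState = {
  lastRunAt: new Date(T),
  baselineIntervalMs: 5 * MIN,
  aiHintIntervalMs: null,
  aiHintNextRunAt: null,
  aiHintExpiresAt: new Date(T + 60 * MIN),
};

const relaxed = planNextRun({ ...base, aiHintIntervalMs: 10 * MIN }, current);
const lateOneShot = planNextRun({ ...base, aiHintNextRunAt: new Date(T + 8 * MIN) }, current);
const both = planNextRun(
  { ...base, aiHintIntervalMs: 10 * MIN, aiHintNextRunAt: new Date(T + 8 * MIN) },
  current,
);
const expired = planNextRun(
  { ...base, aiHintIntervalMs: 30000, aiHintExpiresAt: current },
  current,
);
```

The four sample calls cover each rule: a relaxing interval hint pushes the run past the baseline, a late one-shot loses to the baseline, two hints compete only with each other, and an expired hint falls back to the baseline.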
255
+ ## Nudging: Immediate Effect
256
+
257
+ When the AI writes a hint, it doesn't just sit in the database waiting for the next baseline execution. That could take minutes or hours.
258
+
259
+ Instead, the AI **nudges** the `nextRunAt` field using `setNextRunAtIfEarlier(timestamp)`:
260
+
261
+ 1. Calculate when the hint should take effect (`now + intervalMs` or specific timestamp)
262
+ 2. Compare with current `nextRunAt`
263
+ 3. If hint time is earlier, update `nextRunAt` immediately
264
+ 4. Scheduler claims endpoint on next tick (within 5 seconds)
265
+
266
+ **Example timeline**:
267
+
268
+ ```
269
+ T=0: Endpoint scheduled to run at T=300 (5 minutes from now)
270
+ T=60: AI analyzes, sees spike, proposes 30-second interval
271
+ T=60: AI writes hint and nudges nextRunAt to T=90 (30 seconds from T=60)
272
+ T=65: Scheduler wakes up, claims endpoint (nextRunAt=T=90 is due soon)
273
+ T=90: Scheduler executes endpoint
274
+ ```
275
+
276
+ Without nudging, the endpoint would wait until T=300 to apply the new interval. With nudging, it applies at T=90—within seconds of the AI's decision.
277
+
278
+ **Important**: Nudging respects constraints. If the nudged time violates `minIntervalMs`, the Governor clamps it during scheduling.
279
+
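The text does not spell out how the clamp is computed. One plausible reading, shown here purely as an assumption, is that the candidate time is bounded relative to the last run by the endpoint's min/max interval constraints. The `clampNextRun` function and its `maxIntervalMs` parameter are hypothetical.

```typescript
// Assumed clamp: keep the candidate within
// [lastRunAt + minIntervalMs, lastRunAt + maxIntervalMs] when those limits are set.
function clampNextRun(
  candidate: Date,
  lastRunAt: Date,
  minIntervalMs?: number,
  maxIntervalMs?: number,
): Date {
  let t = candidate.getTime();
  if (minIntervalMs !== undefined) {
    t = Math.max(t, lastRunAt.getTime() + minIntervalMs);
  }
  if (maxIntervalMs !== undefined) {
    t = Math.min(t, lastRunAt.getTime() + maxIntervalMs);
  }
  return new Date(t);
}

const lastRun = Date.parse("2025-11-02T14:00:00Z");

// A nudge to 10 seconds after the last run, with a 30-second minimum interval.
const tooSoon = clampNextRun(new Date(lastRun + 10000), new Date(lastRun), 30000);
// A nudge that already satisfies the constraint is left alone.
const allowed = clampNextRun(new Date(lastRun + 45000), new Date(lastRun), 30000);
```

Under this reading, an over-eager nudge is delayed to the earliest permitted time rather than rejected.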
280
+ ## Structuring Response Bodies for AI
281
+
282
+ The AI can only work with the data you provide. Here's how to structure response bodies:
283
+
284
+ ### Include Relevant Metrics
285
+
286
+ ```json
287
+ {
288
+ "queue_depth": 250,
289
+ "processing_rate_per_min": 80,
290
+ "error_rate_pct": 2.5,
291
+ "avg_latency_ms": 145,
292
+ "healthy": true,
293
+ "last_success_at": "2025-11-02T14:30:00Z"
294
+ }
295
+ ```
296
+
297
+ The AI looks for field names like `queue`, `latency`, `error`, `rate`, `count`, `healthy`, `status`.
298
+
299
+ ### Use Consistent Naming
300
+
301
+ If queue depth is `queue_depth` in one response, don't call it `pending_items` in another. Consistency helps the AI spot trends.
302
+
303
+ ### Include Thresholds
304
+
305
+ ```json
306
+ {
307
+ "queue_depth": 250,
308
+ "queue_max": 500,
309
+ "queue_warning_threshold": 300
310
+ }
311
+ ```
312
+
313
+ This tells the AI when to intervene (crossing thresholds).
314
+
315
+ ### Add Coordination Signals
316
+
317
+ For workflow endpoints:
318
+
319
+ ```json
320
+ {
321
+ "ready_for_processing": true,
322
+ "upstream_completed_at": "2025-11-02T14:45:00Z",
323
+ "data_available": true
324
+ }
325
+ ```
326
+
327
+ The AI can use `get_sibling_latest_responses` to read these flags and coordinate execution.
328
+
329
+ ### Include Timestamps
330
+
331
+ ```json
332
+ {
333
+ "last_alert_sent_at": "2025-11-02T12:00:00Z",
334
+ "last_cache_warm_at": "2025-11-02T13:30:00Z"
335
+ }
336
+ ```
337
+
338
+ This enables **cooldown patterns**—the AI can check if enough time has passed before triggering duplicate actions.
339
+
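A cooldown check over such timestamps could be as simple as the sketch below. The `cooldownElapsed` helper and the choice to treat a missing or unparseable timestamp as "no cooldown in effect" are assumptions for illustration; the AI applies this kind of reasoning itself rather than calling a function like this.

```typescript
// Returns true when the action recorded under `field` is old enough
// (or was never recorded) for the action to be triggered again.
function cooldownElapsed(
  body: Record<string, unknown>,
  field: string,
  cooldownMs: number,
  now: Date,
): boolean {
  const raw = body[field];
  if (typeof raw !== "string") return true; // no prior action recorded
  const last = Date.parse(raw);
  if (isNaN(last)) return true; // unparseable timestamp: assume no cooldown
  return now.getTime() - last >= cooldownMs;
}

const responseBody: Record<string, unknown> = {
  last_alert_sent_at: "2025-11-02T12:00:00Z",
  last_cache_warm_at: "2025-11-02T13:30:00Z",
};
const checkedAt = new Date("2025-11-02T14:00:00Z");
const ONE_HOUR = 60 * 60000;

const canAlertAgain = cooldownElapsed(responseBody, "last_alert_sent_at", ONE_HOUR, checkedAt);
const canWarmCacheAgain = cooldownElapsed(responseBody, "last_cache_warm_at", ONE_HOUR, checkedAt);
```

With a one-hour cooldown, the alert sent two hours earlier may be repeated, while the cache warmed 30 minutes earlier should be left alone.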
340
+ ### Keep It Simple
341
+
342
+ The AI receives truncated response bodies (1000 chars). Don't include massive arrays or deeply nested objects. Focus on summary metrics.
343
+
344
+ ## Decision Framework: When AI Acts
345
+
346
+ The AI follows a conservative decision framework:
347
+
348
+ **Default to stability**: Most of the time, doing nothing is optimal.
349
+
350
+ **Intervene when**:
351
+ - Clear trend (5+ data points showing monotonic change)
352
+ - Threshold crossing (metric jumps significantly)
353
+ - State transition (dependency status changes)
354
+ - Coordination signal (sibling endpoint signals readiness)
355
+
356
+ **Don't intervene when**:
357
+ - Single anomaly (might be transient)
358
+ - Insufficient data (fewer than 10 total runs)
359
+ - Metrics within normal ranges
360
+ - No clear cause-effect logic
361
+
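The AI applies these rules through its prompt rather than through fixed code, but two of them (the 5-point trend and the 10-run minimum) can be illustrated with a small sketch. The function names are hypothetical, and "monotonic" is read strictly here, which is an assumption.

```typescript
// True when there are at least minPoints values and they are strictly
// increasing or strictly decreasing. Values are ordered oldest first.
function isMonotonicTrend(values: number[], minPoints: number = 5): boolean {
  if (values.length < minPoints) return false;
  let increasing = true;
  let decreasing = true;
  let prev: number | null = null;
  for (const v of values) {
    if (prev !== null) {
      if (v <= prev) increasing = false;
      if (v >= prev) decreasing = false;
    }
    prev = v;
  }
  return increasing || decreasing;
}

// Combine the trend rule with the minimum-data rule.
function trendWarrantsAction(totalRuns: number, values: number[]): boolean {
  return totalRuns >= 10 && isMonotonicTrend(values);
}

const growingQueue = trendWarrantsAction(12, [50, 100, 200, 260, 310]);
const stableQueue = trendWarrantsAction(12, [50, 48, 52, 49, 51]);
const tooLittleHistory = trendWarrantsAction(6, [50, 100, 200, 260, 310]);
```

Only the first case qualifies: the stable series has no direction, and the third endpoint has too few total runs to judge.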
362
+ The AI's reasoning is logged in the `ai_analysis_sessions` table. You can review what it considered and why it acted (or didn't).
363
+
364
+ ## Analysis Sessions: Debugging AI Decisions
365
+
366
+ Every AI analysis creates a session record:
367
+
368
+ - **Endpoint analyzed**
369
+ - **Timestamp**
370
+ - **Tools called** (which queries and actions)
371
+ - **Reasoning** (AI's explanation of its decision)
372
+ - **Token usage** (cost tracking)
373
+ - **Duration**
374
+
375
+ This audit trail helps you:
376
+ - Debug unexpected scheduling behavior
377
+ - Understand why AI tightened/relaxed intervals
378
+ - Review cost (token usage per analysis)
379
+ - Tune prompts or constraints based on AI reasoning
380
+
381
+ Check the sessions table when an endpoint's schedule changes unexpectedly. The reasoning field shows the AI's thought process.
382
+
383
+ ## Quota and Cost Control
384
+
385
+ AI analysis costs money (API calls). The system includes quota controls:
386
+
387
+ - Per-tenant quota limits (prevent runaway costs)
388
+ - Before analyzing an endpoint, the planner checks `quota.canProceed(tenantId)`
389
+ - If quota exceeded, skip analysis (Scheduler continues on baseline)
390
+
391
+ This ensures that even if AI becomes unavailable or too expensive, your jobs keep running.
392
+
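A quota-guarded analysis loop might look like the following sketch. Only the call `quota.canProceed(tenantId)` comes from the text; the `QuotaGuard` interface, the synchronous signature, the counting quota, and `runPlannerCycle` are assumptions (the real check is likely asynchronous and backed by stored usage data).

```typescript
interface QuotaGuard {
  canProceed(tenantId: string): boolean;
}

interface Candidate {
  endpointId: string;
  tenantId: string;
}

// Toy quota: each tenant gets a fixed number of analyses, and every
// successful check consumes one. Purely illustrative.
function makeCountingQuota(limits: Record<string, number>): QuotaGuard {
  const used: Record<string, number> = {};
  return {
    canProceed(tenantId: string): boolean {
      const limit = limits[tenantId] ?? 0;
      const count = used[tenantId] ?? 0;
      if (count >= limit) return false;
      used[tenantId] = count + 1;
      return true;
    },
  };
}

function runPlannerCycle(
  candidates: Candidate[],
  quota: QuotaGuard,
  analyze: (c: Candidate) => void,
): { analyzed: string[]; skipped: string[] } {
  const analyzed: string[] = [];
  const skipped: string[] = [];
  for (const c of candidates) {
    if (!quota.canProceed(c.tenantId)) {
      // Over quota: skip analysis; the Scheduler keeps using the baseline.
      skipped.push(c.endpointId);
      continue;
    }
    analyze(c);
    analyzed.push(c.endpointId);
  }
  return { analyzed, skipped };
}

const cycle = runPlannerCycle(
  [
    { endpointId: "e1", tenantId: "tenant-a" },
    { endpointId: "e2", tenantId: "tenant-a" },
    { endpointId: "e3", tenantId: "tenant-b" },
  ],
  makeCountingQuota({ "tenant-a": 1, "tenant-b": 2 }),
  () => {
    /* the AI analysis call would go here */
  },
);
```

The skipped endpoint is not an error case: it simply receives no new hints this cycle and continues on its existing schedule.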
393
+ ## Putting It Together: Example Analysis
394
+
395
+ **Scenario**: Payment queue monitoring endpoint
396
+
397
+ **Baseline**: 5-minute interval
398
+
399
+ **T=0**: Scheduler runs endpoint
400
+ - Response: `{ "queue_depth": 50, "processing_rate": 100, "healthy": true }`
401
+ - Records to database
402
+
403
+ **T=5min**: AI Planner discovers endpoint (ran recently)
404
+ - Queries `get_latest_response`: queue_depth=50
405
+ - Queries `get_response_history(limit=5)`: [50, 48, 52, 49, 51]
406
+ - Analysis: "Stable around 50, no trend, 100% success rate"
407
+ - Decision: "No action—stability optimal"
408
+ - Session logged with reasoning
409
+
410
+ **T=10min**: Scheduler runs again
411
+ - Response: `{ "queue_depth": 150, "processing_rate": 100, "healthy": true }`
412
+ - Records to database
413
+
414
+ **T=12min**: AI Planner analyzes again
415
+ - Queries `get_response_history(limit=5)`: [150, 50, 48, 52, 49]
416
+ - Analysis: "Queue jumped from 50 to 150. Single spike or trend?"
417
+ - Queries `get_response_history(limit=2, offset=5)`: [51, 50]
418
+ - Analysis: "Stable before, then spike. Monitor closely."
419
+ - Decision: `propose_interval(intervalMs=60000, ttlMinutes=15, reason="Queue spike detected")`
420
+ - Nudges `nextRunAt` to T=13min (1 minute from now)
421
+
422
+ **T=13min**: Scheduler claims endpoint (nudged)
423
+ - Executes, gets response: `{ "queue_depth": 200, ... }`
424
+ - Governor sees active interval hint (60000ms)
425
+ - Schedules next run at T=14min (1 minute interval)
426
+
427
+ **T=14min, T=15min, ..., T=27min**: Runs every 1 minute
428
+ - AI hint remains active (TTL=15 minutes from T=12min = expires T=27min)
429
+
430
+ **T=27min**: Hint expires
431
+ - Scheduler's Governor sees `aiHintExpiresAt <= now`
432
+ - Ignores hint, uses baseline (5-minute interval)
433
+ - Schedules next run at T=32min
434
+
435
+ ## Key Takeaways
436
+
437
+ 1. **AI discovers active endpoints**: Only analyzes what's running
438
+ 2. **AI sees health + response data**: Metrics inform decisions
439
+ 3. **Three action tools**: propose_interval, propose_next_time, pause_until
440
+ 4. **Hints have TTLs**: Auto-revert on expiration (safety)
441
+ 5. **Interval hints override baseline**: Enables adaptation
442
+ 6. **Nudging provides immediacy**: Changes apply within seconds
443
+ 7. **Structure response bodies intentionally**: Include metrics AI should monitor
444
+ 8. **Sessions provide audit trail**: Debug AI reasoning
445
+ 9. **Quota controls costs**: AI unavailable ≠ jobs stop running
446
+
447
+ Understanding how AI adaptation works helps you design endpoints that benefit from intelligent scheduling, structure response bodies effectively, and debug unexpected schedule changes.
448
+
449
+ ## Next Steps
450
+
451
+ - **[Coordinating Multiple Endpoints](./coordinating-multiple-endpoints.md)** - Use AI to orchestrate workflows
452
+ - **[Configuration and Constraints](./configuration-and-constraints.md)** - Set up endpoints for optimal AI behavior
453
+ - **[Reference](./reference.md)** - Quick lookup for tools, fields, and defaults