@cronicorn/mcp-server 1.18.3 → 1.19.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +87 -436
- package/dist/docs/api-reference.md +558 -0
- package/dist/docs/core-concepts.md +9 -1
- package/dist/docs/introduction.md +15 -5
- package/dist/docs/mcp-server.md +49 -69
- package/dist/docs/quick-start.md +11 -2
- package/dist/docs/self-hosting.md +10 -1
- package/dist/docs/technical/configuration-and-constraints.md +11 -2
- package/dist/docs/technical/coordinating-multiple-endpoints.md +11 -2
- package/dist/docs/technical/how-ai-adaptation-works.md +122 -385
- package/dist/docs/technical/how-scheduling-works.md +76 -2
- package/dist/docs/technical/reference.md +11 -2
- package/dist/docs/technical/system-architecture.md +57 -189
- package/dist/docs/troubleshooting.md +392 -0
- package/dist/docs/use-cases.md +10 -1
- package/dist/index.js +20 -12
- package/dist/index.js.map +1 -1
- package/package.json +1 -1
- package/dist/docs/competitive-analysis.md +0 -324
- package/dist/docs/developers/README.md +0 -29
- package/dist/docs/developers/authentication.md +0 -121
- package/dist/docs/developers/environment-configuration.md +0 -103
- package/dist/docs/developers/quality-checks.md +0 -68
- package/dist/docs/developers/quick-start.md +0 -87
- package/dist/docs/developers/workspace-structure.md +0 -174
--- a/package/dist/docs/technical/how-scheduling-works.md
+++ b/package/dist/docs/technical/how-scheduling-works.md
@@ -7,12 +7,14 @@ sidebar_position: 2
 mcp:
   uri: file:///docs/technical/how-scheduling-works.md
   mimeType: text/markdown
-  priority: 0.
-  lastModified:
+  priority: 0.90
+  lastModified: 2026-02-03T00:00:00Z
 ---
 
 # How Scheduling Works
 
+**TL;DR:** The Scheduler claims due endpoints, executes them, records results, and uses the Governor (a pure function) to calculate the next run time. AI hints override baseline schedules, constraints are hard limits, and the system includes safety mechanisms for locks, failures, and zombie runs.
+
 This document explains how the Scheduler worker executes jobs and calculates next run times. If you haven't read [System Architecture](./system-architecture.md), start there for context on the dual-worker design.
 
 ## The Scheduler's Job
@@ -183,6 +185,70 @@ The database update is atomic. If two Schedulers somehow claimed the same endpoi
 
 After the update, the endpoint's lock expires naturally (when `_lockedUntil` passes), and it becomes claimable again when `nextRunAt` arrives.
 
+## Distributed Locks and Single Execution Guarantee
+
+A critical requirement for any job scheduler is ensuring each job runs **exactly once** per scheduled time—even when multiple Scheduler instances run concurrently for high availability.
+
+Cronicorn uses **database-level distributed locks** via PostgreSQL's atomic operations to achieve this guarantee.
+
+### How Distributed Locks Work
+
+When the Scheduler claims endpoints, it uses an **atomic conditional update**:
+
+```sql
+UPDATE job_endpoints
+SET _lockedUntil = now() + lockTtlMs
+WHERE nextRunAt <= now()
+AND _lockedUntil <= now()
+RETURNING id
+```
+
+This query atomically:
+1. Finds endpoints that are due (`nextRunAt <= now`)
+2. Checks they're not already locked (`_lockedUntil <= now`)
+3. Acquires the lock by setting `_lockedUntil`
+4. Returns only the IDs that were successfully claimed
+
+Because PostgreSQL executes this as a single atomic operation, **only one Scheduler instance can claim each endpoint**, even if multiple instances query simultaneously.
+
+### Multi-Instance Behavior
+
+When running multiple Scheduler instances:
+
+| Time | Scheduler A | Scheduler B | Endpoint State |
+|------|-------------|-------------|----------------|
+| T=0 | Claims ep_123 | Attempts claim | `_lockedUntil = T+30s` |
+| T=0 | Gets ep_123 | Gets nothing | Locked by A |
+| T=5s | Executing | Skips ep_123 | Still locked |
+| T=10s | Completes, releases | Available but not due | `nextRunAt = T+300s` |
+
+**Result**: Endpoint ep_123 executes exactly once, by Scheduler A.
+
+### Lock TTL and Crash Recovery
+
+Locks have a short **Time-To-Live** (default: 30 seconds). This enables crash recovery:
+
+**If Scheduler A crashes mid-execution:**
+1. The lock remains until `_lockedUntil` expires
+2. After 30 seconds, Scheduler B can claim the endpoint
+3. The run is marked as failed (timeout/zombie)
+4. The endpoint recovers automatically
+
+This means endpoints **can't get permanently stuck**. At worst, there's a delay equal to the lock TTL before another instance picks up the work.
+
+### Single Execution Summary
+
+| Guarantee | Mechanism |
+|-----------|-----------|
+| No double execution | Atomic `UPDATE...WHERE _lockedUntil <= now` |
+| Crash recovery | Lock TTL expires, another instance claims |
+| Multi-instance safety | PostgreSQL transaction isolation |
+| Audit trail | Run records show which instance executed |
+
+This design ensures reliable, exactly-once execution across any number of Scheduler instances.
+
+---
+
 ## Safety Mechanisms
 
 The Scheduler includes several safety mechanisms to handle edge cases:
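The claim semantics of the atomic `UPDATE ... RETURNING` above can be sketched with a small in-memory model. This is an illustrative sketch only — `Endpoint`, `claimDue`, and the field names are hypothetical stand-ins, and a real deployment relies on PostgreSQL executing the conditional update atomically rather than on application code:

```typescript
// In-memory model of the atomic claim. In production, the WHERE/SET/RETURNING
// happen inside a single PostgreSQL statement, which is what makes it safe.
type Endpoint = { id: string; nextRunAt: number; lockedUntil: number };

function claimDue(endpoints: Endpoint[], now: number, lockTtlMs: number): string[] {
  const claimed: string[] = [];
  for (const ep of endpoints) {
    // Mirrors: WHERE nextRunAt <= now() AND _lockedUntil <= now()
    if (ep.nextRunAt <= now && ep.lockedUntil <= now) {
      ep.lockedUntil = now + lockTtlMs; // SET _lockedUntil = now() + lockTtlMs
      claimed.push(ep.id);              // RETURNING id
    }
  }
  return claimed;
}

const table: Endpoint[] = [{ id: "ep_123", nextRunAt: 0, lockedUntil: 0 }];
const byA = claimDue(table, 1_000, 30_000); // Scheduler A claims ep_123
const byB = claimDue(table, 1_000, 30_000); // Scheduler B finds it locked
console.log(byA, byB); // ["ep_123"] []
```

The second call returns nothing because the first call already advanced `lockedUntil` past `now` — the same reason a second Scheduler instance gets an empty result set from the real query.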
@@ -261,3 +327,11 @@ This is the power of database-mediated communication: the Scheduler and AI Plann
 6. **Sources provide auditability**: Every decision is traceable
 
 Understanding how scheduling works gives you the foundation to configure endpoints effectively, debug unexpected behavior, and reason about how AI adaptation affects execution timing.
+
+---
+
+## See Also
+
+- **[System Architecture](./system-architecture.md)** - High-level dual-worker design
+- **[How AI Adaptation Works](./how-ai-adaptation-works.md)** - AI tools, response body design, and decision framework
+- **[Configuration and Constraints](./configuration-and-constraints.md)** - Setting up endpoints effectively
--- a/package/dist/docs/technical/reference.md
+++ b/package/dist/docs/technical/reference.md
@@ -7,8 +7,8 @@ sidebar_position: 6
 mcp:
   uri: file:///docs/technical/reference.md
   mimeType: text/markdown
-  priority: 0.
-  lastModified:
+  priority: 0.90
+  lastModified: 2026-02-03T00:00:00Z
 ---
 
 # Reference
@@ -458,3 +458,12 @@ Calculate current backoff multiplier:
 ```
 failureCount > 0 ? 2^min(failureCount, 5) : 1
 ```
+
+---
+
+## See Also
+
+- **[How Scheduling Works](./how-scheduling-works.md)** - Detailed Governor logic
+- **[How AI Adaptation Works](./how-ai-adaptation-works.md)** - AI tools and decision framework
+- **[Configuration and Constraints](./configuration-and-constraints.md)** - Practical configuration guidance
+- **[Troubleshooting](../troubleshooting.md)** - Debugging guide
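The backoff formula in the hunk above translates directly to code. A sketch of the documented expression — not the package's actual implementation:

```typescript
// Exponential backoff multiplier: doubles per consecutive failure, capped at
// 2^5 = 32x; a healthy endpoint (failureCount = 0) gets the 1x baseline.
function backoffMultiplier(failureCount: number): number {
  return failureCount > 0 ? 2 ** Math.min(failureCount, 5) : 1;
}

console.log([0, 1, 3, 5, 9].map(backoffMultiplier)); // [1, 2, 8, 32, 32]
```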
--- a/package/dist/docs/technical/system-architecture.md
+++ b/package/dist/docs/technical/system-architecture.md
@@ -7,229 +7,108 @@ sidebar_position: 1
 mcp:
   uri: file:///docs/technical/system-architecture.md
   mimeType: text/markdown
-  priority: 0.
-  lastModified:
+  priority: 0.80
+  lastModified: 2026-02-03T00:00:00Z
 ---
 
 # System Architecture
 
-**TL;DR:** Cronicorn uses two independent workers (Scheduler and AI Planner) that communicate only through a shared database. The Scheduler executes jobs reliably on schedule, while the AI Planner analyzes execution patterns and suggests schedule adjustments through time-bounded hints.
+**TL;DR:** Cronicorn uses two independent workers (Scheduler and AI Planner) that communicate only through a shared database. The Scheduler executes jobs reliably on schedule, while the AI Planner analyzes execution patterns and suggests schedule adjustments through time-bounded hints.
 
 ---
 
 ## The Big Picture
 
-Cronicorn uses a **dual-worker architecture** where job scheduling is both reliable and intelligent
+Cronicorn uses a **dual-worker architecture** where job scheduling is both reliable and intelligent:
 
 1. **The Scheduler Worker** - Executes jobs on schedule
 2. **The AI Planner Worker** - Analyzes patterns and suggests schedule adjustments
 
-These workers never communicate directly. All coordination happens through the database.
+These workers never communicate directly. All coordination happens through the database.
 
 ## Why Two Workers?
 
-When you build intelligence directly into a scheduler, every job execution must
+When you build intelligence directly into a scheduler, every job execution must analyze history, call AI models, wait for responses, and handle failures—all in the critical path. If the AI is slow, jobs run late. If the AI crashes, the scheduler might crash.
 
-
-- Make API calls to AI models
-- Wait for responses
-- Process recommendations
-- Handle AI failures gracefully
-
-All of this happens *in the critical path*. If the AI is slow, jobs run late. If the AI crashes, the scheduler might crash. If you update AI logic, you risk breaking job execution.
-
-Cronicorn separates execution from decision-making. The Scheduler executes endpoints reliably and on time. The AI Planner analyzes patterns and suggests adjustments. Neither worker depends on the other.
-
-This separation provides:
+Cronicorn separates execution from decision-making:
 
 - **Reliability**: The Scheduler keeps running even if AI fails
 - **Performance**: Jobs execute immediately without waiting for AI analysis
 - **Safety**: Bugs in AI logic can't break job execution
-- **
-- **Scalability**: We can scale Schedulers and AI Planners separately based on load
+- **Scalability**: Scale Schedulers and AI Planners independently based on load
 
-## How Workers Communicate
+## How Workers Communicate
 
-
-
-Here's how it works:
+Workers coordinate through **shared database state**. The database is both storage and message bus.
 
 ### The Scheduler's Perspective
 
-
-
-
-
-
-
-
-After recording results, the Scheduler calculates when each endpoint should run next and updates the `nextRunAt` field. Then it goes back to sleep for 5 seconds.
+Every 5 seconds, the Scheduler:
+1. Claims due endpoints from the database
+2. Executes each endpoint's HTTP request
+3. Writes results back (status, duration, response body)
+4. Calculates next run time using the Governor
+5. Updates `nextRunAt` and goes back to sleep
 
-
+The Scheduler doesn't analyze patterns or make AI decisions. It executes, records, schedules, repeat.
 
 ### The AI Planner's Perspective
 
-The AI Planner wakes up
+The AI Planner wakes up periodically and analyzes recently active endpoints. For each endpoint, it:
+1. Reads execution history (success rates, response bodies, failure streaks)
+2. Sends context to an AI model
+3. Writes **hints** to the database—temporary scheduling suggestions with expiration times
 
-
-- Success rates over the last 24 hours
-- Recent response bodies
-- Current failure streaks
-- Existing schedule configuration
+The AI Planner doesn't execute jobs or manage locks. It analyzes and suggests.
 
-
-- "Tighten the interval to 30 seconds because load is increasing" (writes an interval hint)
-- "Run this immediately to investigate an issue" (writes a one-shot hint)
-- "Pause until the maintenance window ends" (sets pausedUntil)
-- "Everything looks good, no changes needed" (does nothing)
-
-Any decisions get written to the database as **hints**—temporary scheduling suggestions with expiration times. Then the Planner moves to the next endpoint.
-
-Notice what the AI Planner *doesn't* do: execute jobs, manage locks, or worry about reliability. It analyzes and suggests.
-
-### The Database as Coordination Medium
+### Database as Coordination Medium
 
 This database-mediated architecture means:
 
-
-
-
-
-3. **Fault-tolerant**: If the AI Planner crashes, the Scheduler keeps running jobs on baseline schedules. When AI comes back, it resumes making recommendations.
-
-4. **Scalable**: Want faster AI analysis? Run more AI Planner instances. Want to handle more job executions? Run more Scheduler instances. They scale independently.
+- **Eventually consistent**: The Scheduler might execute a job before the AI has analyzed its previous run
+- **Non-blocking**: The Scheduler never waits for AI—it reads hints already in the database
+- **Fault-tolerant**: If AI crashes, the Scheduler keeps running on baseline schedules
+- **Scalable**: Scale Schedulers and AI Planners independently
 
-##
-
-Understanding how the system works requires understanding the three types of scheduling information stored in the database:
+## Three Types of Scheduling Information
 
 ### 1. Baseline Schedule (Permanent)
 
-
-- A cron expression: `"0 */5 * * *"` (every 5 minutes)
-- A fixed interval: `300000` milliseconds
+What you configure when creating an endpoint:
+- A cron expression: `"0 */5 * * *"` (every 5 minutes)
+- A fixed interval: `300000` milliseconds
 
-The baseline
+The baseline never expires or changes unless you update it.
 
 ### 2. AI Hints (Temporary, Time-Bounded)
 
-
+Recommendations from the AI Planner with automatic expiration:
 
 **Interval Hints**: "Run every 30 seconds for the next hour"
 - Used when AI wants to change run frequency
-- Has a TTL (time-to-live)—expires after N minutes
 - Example: Tightening monitoring during a load spike
 
-**One-Shot Hints**: "Run at 2:30 PM today"
+**One-Shot Hints**: "Run at 2:30 PM today"
 - Used when AI wants to trigger a specific execution
-- Has a TTL—expires if not used within N minutes
 - Example: Immediate investigation of a failure
 
-
+When hints expire, the system falls back to baseline. This is a safety mechanism.
 
 ### 3. Pause State (Manual Override)
 
-
-- Maintenance windows
-- Temporarily disabling misbehaving endpoints
-- Coordinating with external system downtime
-
-Setting `pausedUntil = null` resumes the endpoint immediately.
-
-## How Adaptation Happens
-
-Let's walk through a concrete example.
-
-**Scenario**: You have a traffic monitoring endpoint checking visitor counts every 5 minutes (baseline interval).
-
-**T=0**: Normal day, 2,000 visitors per minute
-- Scheduler runs the endpoint on its 5-minute baseline
-- Response body: `{ "visitorsPerMin": 2000, "status": "normal" }`
-- Scheduler calculates next run at T+5min and updates database
-
-**T+5min**: AI Planner analyzes the endpoint
-- Reads last 24 hours of execution history
-- Sees steady 2,000 visitors with 100% success rate
-- AI decision: "Everything looks stable, no changes needed"
-- No hints written to database
-
-**T+10min**: Flash sale starts, traffic spikes
-- Scheduler runs endpoint on 5-minute baseline
-- Response body: `{ "visitorsPerMin": 5500, "status": "elevated" }`
-- Scheduler records results and schedules next run at T+15min
-
-**T+12min**: AI Planner analyzes again
-- Sees visitor count jumped from 2,000 to 5,500
-- Looks at trend over last few runs—increasing
-- AI decision: "High load detected, need tighter monitoring"
-- Writes interval hint to database: 30 seconds, expires in 60 minutes
-- **Nudges** `nextRunAt` to T+12min+30sec
-
-**T+12min+30sec**: Scheduler wakes up, claims endpoint (now due)
-- Reads endpoint state from database
-- Sees fresh AI hint (30-second interval, expires at T+72min)
-- Governor chooses: AI hint (30 sec) overrides baseline (5 min)
-- Executes endpoint, gets response: `{ "visitorsPerMin": 6200, "status": "high" }`
-- Calculates next run: T+13min (30 seconds from now)
-
-**T+13min through T+72min**: Runs every 30 seconds
-- AI hint remains active
-- Scheduler uses 30-second interval for every run
-- System monitors flash sale closely
+Manually pause an endpoint until a specific time. While paused, the endpoint won't run regardless of baseline or hints.
 
-
-- Scheduler reads endpoint state
-- No valid hints found (aiHintExpiresAt < now)
-- Governor chooses: Baseline (5 min)
-- System returns to normal 5-minute interval
-- AI can propose new hints if load remains high
+## The Governor
 
-
+The **Governor** is a pure function inside the Scheduler that decides when a job runs next. After executing an endpoint, the Scheduler calls the Governor with current time, endpoint configuration, and constraints.
 
-
-2. **Hints have TTLs**—Bad AI decisions auto-correct
-3. **Nudging provides immediacy**—Changes take effect within seconds, not minutes
-4. **Eventual consistency works**—There's a delay between analysis and application, but it's acceptable
-5. **System self-heals**—When hints expire, it returns to known-good baseline
+The Governor evaluates all scheduling information and returns: "Run this endpoint next at [timestamp]."
 
-
+The Governor is deterministic—same inputs always produce the same output. It has no side effects and makes no database calls.
 
-
+For detailed Governor logic, see [How Scheduling Works](./how-scheduling-works.md).
 
-
-- Current time
-- Endpoint configuration (baseline, hints, constraints)
-- Cron parser (for cron expressions)
-
-The Governor evaluates all scheduling information and returns a single answer: "Run this endpoint next at [timestamp]."
-
-The Governor is deterministic—same inputs always produce the same output. It has no side effects, makes no database calls, and contains no business logic beyond "what time should this run next?"
-
-This determinism makes the Governor:
-- **Testable**: We can verify scheduling logic with unit tests
-- **Auditable**: Every scheduling decision has a clear source ("baseline-cron", "ai-interval", etc.)
-- **Debuggable**: You can trace why a job ran when it did
-- **Portable**: The algorithm can be understood, documented, and reimplemented
-
-The Governor's logic is covered in detail in [How Scheduling Works](./how-scheduling-works.md).
-
-## Why This Architecture Works for Adaptation
-
-Traditional cron systems are static—you set a schedule and it runs forever on that schedule. Cronicorn's architecture enables adaptive scheduling because:
-
-1. **Separation allows continuous learning**: While the Scheduler executes jobs, the AI Planner can analyze patterns without disrupting execution. Analysis happens in parallel, not blocking execution.
-
-2. **Hints enable safe experimentation**: Because hints have TTLs, the AI can try aggressive schedule changes knowing they'll auto-expire if wrong. This allows quick adaptation without risk.
-
-3. **Database state captures context**: Every execution records response bodies. The AI can see the data returned by endpoints—not just success/failure, but real metrics like queue depths, error rates, latency. This rich context enables intelligent decisions.
-
-4. **Override semantics enable tightening**: AI interval hints *override* baseline (not just compete), so the system can tighten monitoring during incidents. Without this override, the baseline would always win and adaptation would be limited to relaxation only.
-
-5. **Independent scaling supports different workloads**: Execution workload (Scheduler) and analysis workload (AI Planner) have different characteristics. Separating them allows optimizing each independently.
-
-## Data Flows: Putting It All Together
-
-Here's how information flows through the system:
+## Data Flow
 
 ```
 [User Creates Endpoint]
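The five-step Scheduler loop added in the hunk above (claim, execute, record, govern, sleep) can be sketched as a single tick. This is an editor's illustrative sketch: `claimDue`, `recordRun`, `setNextRunAt`, and the callback shapes are hypothetical names standing in for the package's internals, which this diff does not show:

```typescript
// One tick of the Scheduler loop described in the docs. A driver would call
// this every 5 seconds; all coordination goes through the `db` object.
async function schedulerTick(
  db: {
    claimDue: () => Promise<string[]>;                  // 1. claim due endpoints
    recordRun: (id: string, ok: boolean) => Promise<void>;
    setNextRunAt: (id: string, at: Date) => Promise<void>;
  },
  execute: (id: string) => Promise<boolean>,            // 2. run the HTTP request
  governor: (id: string, now: Date) => Date,            // 4. pure next-run calculation
): Promise<void> {
  const due = await db.claimDue();
  for (const id of due) {
    const ok = await execute(id);
    await db.recordRun(id, ok);                         // 3. write results back
    await db.setNextRunAt(id, governor(id, new Date())); // 5. schedule, move on
  }
}
```

Note that nothing in the tick consults an AI model — consistent with the diff's point that the Scheduler only reads hints that are already in the database.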
@@ -252,47 +131,36 @@ Here's how information flows through the system:
 ↓
 Governor sees hints, calculates next run
 ↓
-Database (nextRunAt updated with hint influence)
-↓
 [Cycle continues...]
 ```
 
-
-
-## Trade-offs and Design Decisions
+## Trade-offs
 
-
-
-**✅ Pros:**
+**Pros:**
 - Reliability (execution never blocked by AI)
 - Performance (no AI in critical path)
 - Scalability (independent worker scaling)
 - Safety (AI bugs can't break execution)
 - Testability (deterministic components)
 
-
-- Eventual consistency (hints applied after next execution
+**Cons:**
+- Eventual consistency (hints applied after next execution)
 - Database as bottleneck (all coordination through DB)
-- More complex deployment (two worker types
-- Debugging requires understanding async flows
+- More complex deployment (two worker types)
 
-
+The slight delay in applying AI hints (typically 5-30 seconds) is acceptable because scheduling adjustments aren't time-critical—we're optimizing for hours/days, not milliseconds.
 
 ## What You Need to Know as a User
 
-
-
-
-
-2. **Response bodies matter**: The AI analyzes the JSON you return. Structure it to include metrics the AI should monitor (queue depths, error rates, status flags).
-
-3. **Constraints are hard limits**: Min/max intervals and pause states override even AI hints. Use them to enforce invariants (rate limits, maintenance windows).
-
-4. **Coordination happens via response bodies**: To orchestrate multiple endpoints, have them write coordination signals to their response bodies. Other endpoints can read these via the `get_sibling_latest_responses` tool.
-
-5. **The system is eventually consistent**: Don't expect instant reactions to every change. The AI analyzes every 5 minutes, and hints apply on the next execution. Plan for minutes, not seconds.
-
+1. **Your baseline schedule is your safety net**: The system returns to baseline when hints expire
+2. **Response bodies matter**: The AI analyzes the JSON you return
+3. **Constraints are hard limits**: Min/max intervals and pause states override AI hints
+4. **The system is eventually consistent**: Plan for minutes, not seconds
 
 ---
 
-
+## See Also
+
+- **[How Scheduling Works](./how-scheduling-works.md)** - Detailed Governor logic and safety mechanisms
+- **[How AI Adaptation Works](./how-ai-adaptation-works.md)** - AI tools, response body design, and decision framework
+- **[Configuration and Constraints](./configuration-and-constraints.md)** - Setting up endpoints effectively