mongodash 2.1.0 → 2.1.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +70 -4
- package/dist/dashboard/index.html +8 -8
- package/dist/lib/playground/server.js +1 -8
- package/dist/lib/src/ConcurrentRunner.js +1 -8
- package/dist/lib/src/OnError.js +1 -8
- package/dist/lib/src/OnInfo.js +1 -8
- package/dist/lib/src/createContinuousLock.js +1 -8
- package/dist/lib/src/cronTasks.js +1 -8
- package/dist/lib/src/getCollection.js +1 -8
- package/dist/lib/src/getMongoClient.js +1 -8
- package/dist/lib/src/globalsCollection.js +1 -8
- package/dist/lib/src/index.js +1 -8
- package/dist/lib/src/initPromise.js +1 -8
- package/dist/lib/src/mongoCompatibility.js +1 -8
- package/dist/lib/src/parseInterval.js +1 -8
- package/dist/lib/src/prefixFilterKeys.js +1 -8
- package/dist/lib/src/processInBatches.js +1 -8
- package/dist/lib/src/reactiveTasks/LeaderElector.js +1 -8
- package/dist/lib/src/reactiveTasks/MetricsCollector.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskManager.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskOps.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskPlanner.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskReconciler.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskRegistry.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskRepository.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskRetryStrategy.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskTypes.js +1 -8
- package/dist/lib/src/reactiveTasks/ReactiveTaskWorker.js +1 -8
- package/dist/lib/src/reactiveTasks/compileWatchProjection.js +1 -8
- package/dist/lib/src/reactiveTasks/index.js +1 -8
- package/dist/lib/src/reactiveTasks/queryToExpression.js +1 -8
- package/dist/lib/src/reactiveTasks/validateTaskFilter.js +1 -8
- package/dist/lib/src/task-management/OperationalTaskController.js +1 -8
- package/dist/lib/src/task-management/index.js +1 -8
- package/dist/lib/src/task-management/serveDashboard.js +1 -8
- package/dist/lib/src/task-management/types.js +1 -8
- package/dist/lib/src/withLock.js +1 -8
- package/dist/lib/src/withTransaction.js +1 -8
- package/dist/lib/tools/check-db-connection.js +1 -8
- package/dist/lib/tools/clean-testing-databases.js +1 -8
- package/dist/lib/tools/prepare-republish.js +1 -8
- package/dist/lib/tools/test-matrix-local.js +1 -8
- package/dist/lib/tools/testingDatabase.js +1 -8
- package/docs/.vitepress/cache/deps/_metadata.json +6 -6
- package/docs/.vitepress/config.mts +2 -0
- package/docs/.vitepress/theme/style.css +5 -0
- package/docs/getting-started.md +68 -2
- package/docs/index.md +3 -0
- package/docs/public/logo.png +0 -0
- package/docs/reactive-tasks.md +402 -403
- package/package.json +5 -1
package/docs/reactive-tasks.md
CHANGED
````diff
@@ -4,45 +4,29 @@ A powerful, distributed task execution system built on top of [MongoDB Change St
 
 Reactive Tasks allow you to define background jobs that trigger automatically when your data changes. This enables **Model Data-Driven Flows**, where business logic is triggered by state changes (e.g., `status: 'paid'`) rather than explicit calls. The system handles **concurrency**, **retries**, **deduplication**, and **monitoring** out of the box.
 
-##
+## Overview
+
+### Features
 
 - **Reactive**: Tasks triggered instantly (near-real-time) by database changes (insert/update).
 - **Distributed**: Safe to run on multiple instances (Kubernetes/Serverless). Only one instance processes a specific task for a specific document at a time.
 - **Efficient Listener**: Regardless of the number of application instances, **only one instance (the leader)** listens to the MongoDB Change Stream. This minimizes database load significantly (O(1) connections), though it implies that the total ingestion throughput is limited by the single leader instance.
 - **Reliable**: Built-in retry mechanisms (exponential backoff) and "Dead Letter Queue" logic.
 - **Efficient**: Uses MongoDB Driver for low-latency updates and avoids polling where possible.
-- **Memory Efficiency**: The system is designed to handle large datasets. During live scheduling (Change Streams), reconciliation, and periodic cleanup, the library only loads the `_id`'s of the source documents into memory, keeping the footprint low regardless of the collection size. Note that task *storage* size depends on your `watchProjection` configuration—see [Storage Optimization](#change-detection
+- **Memory Efficiency**: The system is designed to handle large datasets. During live scheduling (Change Streams), reconciliation, and periodic cleanup, the library only loads the `_id`'s of the source documents into memory, keeping the footprint low regardless of the collection size. Note that task *storage* size depends on your `watchProjection` configuration—see [Storage Optimization](#change-detection-and-storage-optimization).
 - **Observability**: First-class Prometheus metrics support.
 - **Dashboard**: A visual [Dashboard](./dashboard.md) to monitor, retry, and debug tasks.
 
-
-
-The system uses a **Leader-Worker** architecture to balance efficiency and scalability.
-
-### 1. The Leader (Planner)
-- **Role**: A single instance is elected as the **Leader**.
-- **Responsibility**: It listens to the MongoDB Change Stream, calculates the necessary tasks (based on `watchProjection`), and persists them into the `_tasks` collection. To minimize memory usage, it only fetches the document `_id` from the Change Stream event.
-> [!NOTE]
-> **Database Resolution**: The Change Stream is established on the database of the **first registered reactive task**.
-- **Resilience**: Leadership is maintained via a distributed lock with a heartbeat. If the leader crashes, another instance automatically takes over (Failover).
-
-### 2. The Workers (Executors)
-- **Role**: *Every* instance (including the leader) runs a set of **Worker** threads (managed by the event loop).
-- **Responsibility**: Workers poll the `_tasks` collection for `pending` jobs, lock them, and execute the `handler`.
-- **Adaptive Polling**: Workers use an **adaptive polling** mechanism.
-  - **Idle**: If no tasks are found, the polling frequency automatically lowers (saves CPU/IO).
-  - **Busy**: If tasks are found (or the **local** Leader signals new work), the frequency speeds up immediately to process the queue as fast as possible. Workers on other instances will speed up once they independently find a task during their regular polling.
-
-## Reactive vs Scheduled Tasks
+### Reactive vs Scheduled Tasks
 
 It is important to distinguish between Reactive Tasks and standard schedulers (like Agenda or BullMQ).
 
 - **Reactive Tasks (Reactors)**: Triggered by **state changes** (data). "When Order is Paid, send email". This guarantees consistency with data.
 - **Schedulers**: Triggered by **time**. "Send email at 2:00 PM".
 
-Reactive Tasks support time-based operations via `
+Reactive Tasks support time-based operations via `debounce` (e.g., "Wait 1m after data change to settle") and `deferCurrent` (e.g., "Retry in 5m"), but they are fundamentally event-driven. If you need purely time-based jobs (e.g., "Daily Report" without any data change trigger), you can trigger them via a [Cron job](./cron-tasks.md), although you can model them as "Run on insert to 'daily_reports' collection".
 
-
+### Advantages over Standard Messaging
 
 Using Reactive Tasks instead of a traditional message broker (RabbitMQ, Kafka) provides distinct architectural benefits:
 
````
````diff
@@ -58,7 +42,6 @@ Using Reactive Tasks instead of a traditional message broker (RabbitMQ, Kafka) p
 - The task queue is stored in a standard MongoDB collection (`[collection]_tasks`), not in a hidden broker queue.
 - You can use standard tools (MongoDB Compass, Atlas Data Explorer, simple queries) to inspect pending jobs, debug failures, and analyze queue distribution without needing specialized queue management interfaces.
 
-
 ## Getting Started
 
 ### 1. Initialization
````
````diff
@@ -113,74 +96,7 @@ import { startReactiveTasks } from 'mongodash';
 await startReactiveTasks();
 ```
 
-### 4.
-
-Reactive Tasks are versatile. Here are a few patterns you can implement:
-
-#### A. Webhook Delivery & Data Sync
-Perfect for reliable delivery of data to external systems. If the external API is down, Mongodash will automatically retry with exponential backoff.
-
-```typescript
-await reactiveTask({
-  task: 'sync-order-to-erp',
-  collection: 'orders',
-  filter: { status: 'paid' }, // Only sync when paid
-  watchProjection: { status: 1 }, // Only check when status changes
-
-  handler: async (context) => {
-    const order = await context.getDocument();
-    await axios.post('https://erp-system.com/api/orders', order);
-  }
-});
-```
-
-#### B. Async Statistics Recalculation
-Offload heavy calculations from the main request path. When a raw document changes, update the aggregated view in the background.
-
-```typescript
-await reactiveTask({
-  task: 'recalc-product-rating',
-  collection: 'reviews',
-  debounce: '5s', // Re-calc at most once every 5 seconds per product
-
-  handler: async (context) => {
-    // We only watched 'status', so we might need the full doc?
-    // Or if we have the ID, that's enough for aggregation:
-    const { docId } = context;
-
-    // Calculate new average
-    const stats = await calculateAverageRating(docId);
-
-    // Update product document
-    await db.collection('products').updateOne(
-      { _id: docId },
-      { $set: { rating: stats.rating, reviewCount: stats.count } }
-    );
-  }
-});
-```
-
-#### C. Pub-Sub (Event Bus)
-Use Reactive Tasks as a distributed Event Bus. By creating an events collection and watching only the `_id`, you effectively create a listener that triggers **only on new insertions**.
-
-```typescript
-await reactiveTask({
-  task: 'send-welcome-sequence',
-  collection: 'app_events',
-
-  // TRICK: _id never changes.
-  // This config ensures the handler ONLY runs when a new document is inserted.
-  watchProjection: { _id: 1 },
-  filter: { type: 'user-registered' },
-
-  handler: async (context) => {
-    const event = await context.getDocument();
-    await emailService.sendWelcome(event.payload.email);
-  }
-});
-```
-
-### 5. Advanced Initialization
+### 4. Advanced Configuration
 
 You can customize the scheduler behavior via `mongodash.init`:
 
````
````diff
@@ -239,6 +155,8 @@ await mongodash.init({
 });
 ```
 
+## Writing Tasks
+
 ### Task Options
 
 | Option | Type | Description |
````
````diff
@@ -247,35 +165,81 @@
 | `collection` | `string` | **Required**. Name of the MongoDB collection to watch. |
 | `handler` | `(context) => Promise<void>` | **Required**. Async function to process the task. Use `context.getDocument()` to get the document. |
 | `filter` | `Document` | Standard Query (e.g., `{ status: 'pending' }`) OR Aggregation Expression (e.g., `{ $eq: ['$status', 'pending'] }`). Aggregation syntax unlocks powerful features like using `$$NOW` for time-based filtering. |
-| `watchProjection` | `Document` | MongoDB Projection. Task only triggers if the projected result changes. Supports inclusion `{ a: 1 }` and computed fields. |
+| `watchProjection` | `Document` | MongoDB Projection. Task only re-triggers if the projected result changes. Supports inclusion `{ a: 1 }` and computed fields. |
 | `debounce` | `number \| string` | Debounce window (ms or duration string). Default: `1000`. Useful to group rapid updates. |
 | `retryPolicy` | `RetryPolicy` | Configuration for retries on failure. |
 | `cleanupPolicy` | `CleanupPolicy` | Configuration for automatic cleanup of orphaned task records. See [Cleanup Policy](#cleanup-policy). |
 | `executionHistoryLimit` | `number` | Number of past execution entries to keep in `_tasks` doc. Default: `5`. |
-| `evolution` | `EvolutionConfig` | Configuration for handling task logic updates (versioning, reconciliation policies). |
+| `evolution` | `EvolutionConfig` | Configuration for handling task logic updates (versioning, reconciliation policies). See [Filter Evolution & Reconciliation](#filter-evolution-and-reconciliation). |
 
-###
+### Common Use Cases
 
-
+Reactive Tasks are versatile. Here are a few patterns you can implement:
 
-
-
-2. **Snapshotting**: This task document holds a snapshot of the source document's fields (specifically, the result of `watchProjection`).
-3. **Diffing**: When an event occurs (or during reconciliation), the system compares the current state of the document against the stored snapshot (`lastObservedValues`).
-4. **No-Op**: If the watched fields haven't changed, **no task is triggered**. This guarantees reliability and prevents redundant processing.
+#### A. Webhook Delivery & Data Sync
+Perfect for reliable delivery of data to external systems. If the external API is down, Mongodash will automatically retry with exponential backoff.
 
-
-
-
-
-
-
-
-
-
-
+```typescript
+await reactiveTask({
+  task: 'sync-order-to-erp',
+  collection: 'orders',
+  filter: { status: 'paid' }, // Only sync when paid
+  watchProjection: { status: 1 }, // Only check when status changes
+
+  handler: async (context) => {
+    const order = await context.getDocument();
+    await axios.post('https://erp-system.com/api/orders', order);
+  }
+});
+```
+
+#### B. Async Statistics Recalculation
+Offload heavy calculations from the main request path. When a raw document changes, update the aggregated view in the background.
+
+```typescript
+await reactiveTask({
+  task: 'recalc-product-rating',
+  collection: 'reviews',
+  debounce: '5s', // Re-calc at most once every 5 seconds per product
+
+  handler: async (context) => {
+    // We only watched 'status', so we might need the full doc?
+    // Or if we have the ID, that's enough for aggregation:
+    const { docId } = context;
+
+    // Calculate new average
+    const stats = await calculateAverageRating(docId);
+
+    // Update product document
+    await db.collection('products').updateOne(
+      { _id: docId },
+      { $set: { rating: stats.rating, reviewCount: stats.count } }
+    );
+  }
+});
+```
+
+#### C. Pub-Sub (Event Bus)
+Use Reactive Tasks as a distributed Event Bus. By creating an events collection and watching only the `_id`, you effectively create a listener that triggers **only on new insertions**.
+
+```typescript
+await reactiveTask({
+  task: 'send-welcome-sequence',
+  collection: 'app_events',
+
+  // TRICK: _id never changes.
+  // This config ensures the handler ONLY runs when a new document is inserted.
+  watchProjection: { _id: 1 },
+  filter: { type: 'user-registered' },
+
+  handler: async (context) => {
+    const event = await context.getDocument();
+    await emailService.sendWelcome(event.payload.email);
+  }
+});
+```
 
-###
+### The Handler Context: `getDocument` & Safety Checks
 
 Critically, the library performs a **runtime check** when you call `await context.getDocument()` inside your handler.
 
````
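The `watchProjection` re-trigger rule documented in the hunk above (the task fires only when the projected result changes) can be sketched with a small self-contained snippet. This is an illustrative model only, not mongodash's actual implementation; `applyProjection` and `shouldRetrigger` are hypothetical helpers, and only simple inclusion projections are handled:

```typescript
// Hypothetical sketch (NOT mongodash internals): deciding whether a document
// change should re-trigger a task under an inclusion-style watchProjection.
type Doc = Record<string, unknown>;

// Apply a simple inclusion projection like { status: 1 } to a document.
function applyProjection(doc: Doc, projection: Record<string, 1>): Doc {
  const snapshot: Doc = {};
  for (const field of Object.keys(projection)) {
    if (field in doc) snapshot[field] = doc[field];
  }
  return snapshot;
}

// Re-trigger only if the projected snapshot differs from the stored one.
function shouldRetrigger(stored: Doc, current: Doc, projection: Record<string, 1>): boolean {
  return JSON.stringify(applyProjection(stored, projection)) !==
         JSON.stringify(applyProjection(current, projection));
}

const projection: Record<string, 1> = { status: 1 };
const before = { _id: 'o1', status: 'pending', total: 99 };
// Unwatched field changed → no re-trigger.
console.log(shouldRetrigger(before, { _id: 'o1', status: 'pending', total: 120 }, projection)); // false
// Watched field changed → re-trigger.
console.log(shouldRetrigger(before, { _id: 'o1', status: 'paid', total: 99 }, projection)); // true
```

This mirrors why the sync-to-ERP example in the diff pairs `filter: { status: 'paid' }` with `watchProjection: { status: 1 }`: edits to unwatched fields never wake the task.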
````diff
@@ -342,142 +306,132 @@ handler: async (context) => {
 }
 ```
 
+### Flow Control (Defer / Throttle)
 
+Sometimes you need dynamic control over task execution speed based on external factors (e.g., rate limits of a 3rd party API) or business logic.
 
-
-
-The Cleanup Policy controls automatic deletion of orphaned task records — tasks whose source documents have been deleted or no longer match the configured filter.
-
-#### Configuration
-
-```typescript
-cleanupPolicy?: {
-  deleteWhen?: 'sourceDocumentDeleted' | 'sourceDocumentDeletedOrNoLongerMatching' | 'never';
-  keepFor?: string | number;
-}
-```
-
-| Property | Type | Default | Description |
-|----------|------|---------|-------------|
-| `deleteWhen` | `string` | `'sourceDocumentDeleted'` | When to trigger task deletion |
-| `keepFor` | `string \| number` | `'24h'` | Grace period before deletion (e.g., `'1h'`, `'7d'`, or `86400000` ms) |
-
-#### Deletion Strategies (`deleteWhen`)
-
-| Strategy | Behavior |
-|----------|----------|
-| `sourceDocumentDeleted` | **Default.** Task deleted only when its source document is deleted from the database. Filter mismatches are ignored. |
-| `sourceDocumentDeletedOrNoLongerMatching` | Task deleted when source document is deleted **OR** when it no longer matches the task's `filter`. Useful for cases the change of document is permament and it is not expected the document could match in the future again and retrigger because of that. Also useful for `$$NOW`-based or dynamic filters. |
-| `never` | Tasks are never automatically deleted. Use for audit trails or manual cleanup scenarios. |
-
-#### Grace Period Calculation
-
-The `keepFor` grace period is measured from `MAX(updatedAt, lastFinalizedAt)`:
+The `handler` receives a `context` object that exposes flow control methods.
 
-
-- **`lastFinalizedAt`**: When a worker last completed or failed the task
+#### 1. Deferral (`deferCurrent`)
 
-
-1. The source data changed recently, OR
-2. A worker processed the task recently
+Delays the **current** task execution. The task is put back into the queue specifically for this document and will not be picked up again until the specified time.
 
-
+This is useful for:
+* **Rate Limits**: "API returned 429, try again in 30 seconds."
+* **Business Waits**: "Customer created, but wait 1 hour before sending first email."
 
 ```typescript
 await reactiveTask({
-  task: '
-  collection: '
-
-
-
-
-
-
-
-
-
+  task: 'send-webhook',
+  collection: 'events',
+  handler: async (context) => {
+    const doc = await context.getDocument();
+    try {
+      await sendWebhook(doc);
+    } catch (err) {
+      if (err.status === 429) {
+        const retryAfter = err.headers['retry-after'] || 30; // seconds
+
+        // Defer THIS task only.
+        // It resets status to 'pending' and schedules it for future.
+        // It does NOT increment attempt count (it's not a failure).
+        context.deferCurrent(retryAfter * 1000);
+        return;
+      }
+      throw err; // Use standard retry policy for other errors
+    }
+  }
 });
 ```
 
-####
+#### 2. Throttling (`throttleAll`)
 
-
+Pauses all FUTURE tasks of this type for a specified duration. This serves as a "Circuit Breaker" when an external system (e.g., CRM, Payment Gateway) is unresponsive or returns overload errors (503, 429).
 
 ```typescript
-
-  reactiveTaskCleanupInterval: '12h', // Run cleanup every 12 hours (default: '24h')
-});
+context.throttleAll(60 * 1000); // Pause this task type for 1 minute
 ```
 
-
-
-
-
+> [!IMPORTANT]
+> **Cluster Behavior (Instance-Local)**
+> `throttleAll` operates only in the memory of the current instance (worker).
+> In a distributed environment (e.g., Kubernetes with multiple pods), other instances will not know about the issue immediately. They will continue processing until they independently encounter the error and trigger their own `throttleAll`.
+>
+> **Result**: The load on the external service will not drop to zero immediately but will decrease gradually as individual instances hit the "circuit breaker".
 
+> [!NOTE]
+> **Current Task**
+> `throttleAll` does not affect the currently running task. If you want to postpone the current task (so it counts as pending and retries after the pause), you must explicitly call `deferCurrent()`.
 
-
-
-Reactive Tasks are designed to evolve with your application. As you deploy new versions of your code, you might change the `filter`, `watchProjection`, or the `handler` logic itself. The system automatically detects these changes and adapts the task state accordingly.
-
-You can control this behavior using the optional `evolution` configuration:
+**Example (Service Down):**
 
 ```typescript
 await reactiveTask({
-  task: '
-  collection: '
-
-
-
-
-
-
-
-
-
-  // - 'reprocess_failed': Reset all 'failed' tasks to 'pending' to retry with new code.
-  // - 'reprocess_all': Reset ALL tasks (even completed ones) to 'pending'.
-  onHandlerVersionChange: 'reprocess_failed',
-
-  // If 'filter' or 'watchProjection' changes, should we run reconciliation?
-  // Default: true
-  reconcileOnTriggerChange: true
-},
+  task: 'sync-to-crm',
+  collection: 'users',
+  handler: async (context) => {
+    // Note: You can throttle even before fetching the doc if you know the service is down!
+    try {
+      const doc = await context.getDocument();
+      await crmApi.update(doc);
+    } catch (err) {
+      // If service is unavailable (503) or circuit breaker is open
+      if (err.status === 503 || err.isCircuitBreakerOpen) {
+        console.warn('CRM is down, pausing tasks for 1 minute.');
 
-
+        // 1. Stop processing future tasks of this type on this instance
+        context.throttleAll(60 * 1000);
+
+        // 2. Defer the CURRENT task so it retries after the pause
+        context.deferCurrent(60 * 1000);
+        return;
+      }
+      throw err; // Standard retry policy for other errors
+    }
+  }
 });
 ```
 
-
+### Idempotency & Re-execution
 
-
+The system is designed with an **At-Least-Once** execution guarantee. This is a fundamental property of distributed systems that value reliability over "exactly-once".
 
-
-* **Pending Tasks**: Workers will pick up pending tasks. Before execution, they perform a "Late-Binding Check". If the document no longer matches the new filter (e.g. amount is 75), the task is **skipped** (completed) without running the handler.
-* **Existing Tasks**: Tasks for documents that no longer match are **not deleted** immediately; they remain as history but won't satisfy the filter for future updates. See the cleanup policies for more details.
+While the system strives to execute your handler exactly once per event, there are specific scenarios where it might execute multiple times for the same document state. Therefore, **your `handler` must be idempotent**.
 
-
-* **Reconciliation**: The system detects the filter change and automatically triggers a **Reconciliation** scan for this specific task.
-* **Backfilling**: It scans the collection for documents that *now* match the new filter (e.g. amount 75) but don't have a task yet. It schedules new tasks for them immediately.
-  * *Note*: This ensures specific newly-matched documents get processed without needing a manual migration script.
+#### Common Re-execution Scenarios
 
-
-
+1. **Transient Failures (Retries)**: If a worker crashes or loses network connectivity during execution (before marking the task `completed`), the lock will expire. Another worker will pick up the task and retry it.
+2. **Reconciliation Recovery**: If task records are deleted (e.g. manual cleanup) but source documents remain, once a reconciliation runs, it recreates them as `pending`.
+3. **Filter Re-matching**: If a document stops matching the task filter and its task record is deleted (because the **sourceDocumentDeletedOrNoLongerMatching** cleanup policy is used), a later change that makes the document match again will recreate the task as `pending`.
+4. **Explicit Reprocessing**: You might trigger re-execution manually (via `retryReactiveTasks`) or through schema evolution policies (`reprocess_all`).
 
-####
+#### Designing Idempotent Handlers
 
-
+Ensure your handler allows multiple executions without adverse side effects.
 
-
-
-
-
-
+**Example**:
+```typescript
+handler: async (context) => {
+  // 1. Fetch document (with verification)
+  const order = await context.getDocument();
 
-
-
+  // 2. Check if the work is already done
+  if (order.emailSent) return;
+
+  // 3. Perform the side-effect
+  await sendEmail(order.userId, "Order Received");
+
+  // 4. Mark as done (using atomic update)
+  await db.collection('orders').updateOne(
+    { _id: order._id },
+    { $set: { emailSent: true } }
+  );
+}
+```
 
-
+## Policies & Lifecycle
 
+### Retry Policy
 
 You can configure the retry behavior using the `retryPolicy` option.
 
````
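The instance-local "circuit breaker" semantics of `throttleAll` described in the hunk above can be modeled with a few lines of plain TypeScript. This is a hypothetical sketch of the idea (the `TaskTypeThrottle` class is invented for illustration, not part of mongodash's API):

```typescript
// Hypothetical sketch (NOT mongodash internals): an instance-local,
// per-task-type pause, as a single worker process might keep it in memory.
class TaskTypeThrottle {
  private resumeAt = new Map<string, number>();

  // Pause all future executions of a task type on THIS instance only.
  throttleAll(task: string, pauseMs: number, now = Date.now()): void {
    this.resumeAt.set(task, now + pauseMs);
  }

  // Workers consult this before picking up a pending task of the given type.
  isThrottled(task: string, now = Date.now()): boolean {
    const until = this.resumeAt.get(task);
    return until !== undefined && now < until;
  }
}

const throttle = new TaskTypeThrottle();
throttle.throttleAll('sync-to-crm', 60_000, 1_000);
console.log(throttle.isThrottled('sync-to-crm', 30_000));  // true: still paused
console.log(throttle.isThrottled('sync-to-crm', 61_001));  // false: pause elapsed
console.log(throttle.isThrottled('send-webhook', 30_000)); // false: other task types unaffected
```

Because the map lives in one process's memory, other pods keep processing until they hit the error themselves, which is exactly the gradual load decrease the `[!IMPORTANT]` note in the diff describes.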
````diff
@@ -487,7 +441,7 @@ You can configure the retry behavior using the `retryPolicy` option.
 | :--- | :--- | :--- | :--- |
 | `type` | `string` | **Required** | `'fixed'`, `'linear'`, `'exponential'`, `'series'`, or `'cron'` |
 | `maxAttempts` | `number` | `5`* | Maximum total attempts (use `-1` for unlimited). |
-| `maxDuration` | `string \| number` | `undefined` | Stop retrying if elapsed time exceeds this value. |
+| `maxDuration` | `string \| number` | `undefined` | Stop retrying if elapsed time since the **first failure** in the current sequence exceeds this value. |
 | `resetRetriesOnDataChange` | `boolean` | `true` | Reset attempt count if the source document changes. |
 
 *\* If `maxDuration` is specified, `maxAttempts` defaults to unlimited.*
````
````diff
@@ -504,7 +458,7 @@ You can configure the retry behavior using the `retryPolicy` option.
 | **`series`** | `intervals` | - | Array of fixed delays (e.g., `['1m', '5m', '15m']`). |
 | **`cron`** | `expression` | - | Standard cron string for scheduling retries. |
 
-
+#### Examples
 
 ```typescript
 // 1. Give up after 24 hours (infinite attempts within that window)
````
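The retry-timing semantics changed in the hunks above (notably `maxDuration` now being measured from the first failure) can be sketched as pure functions. This is an assumed model for illustration only: `retryDelay` and `shouldRetry` are hypothetical helpers, the exponential factor of 2 and the "last interval repeats" behavior for `series` are assumptions, and `cron` is omitted:

```typescript
// Hypothetical sketch of retry timing math (assumed semantics, NOT mongodash's
// actual implementation).
type RetryPolicy =
  | { type: 'fixed'; interval: number }
  | { type: 'linear'; interval: number }
  | { type: 'exponential'; interval: number; factor?: number }
  | { type: 'series'; intervals: number[] };

// Delay (ms) to wait after failure number `attempt` (1-based).
function retryDelay(policy: RetryPolicy, attempt: number): number {
  switch (policy.type) {
    case 'fixed': return policy.interval;
    case 'linear': return policy.interval * attempt;
    case 'exponential': return policy.interval * Math.pow(policy.factor ?? 2, attempt - 1);
    case 'series': return policy.intervals[Math.min(attempt, policy.intervals.length) - 1];
  }
}

// maxDuration: keep retrying only while time since the FIRST failure
// in the current sequence stays within the limit.
function shouldRetry(firstFailureAt: number, now: number, maxDurationMs: number): boolean {
  return now - firstFailureAt <= maxDurationMs;
}

console.log(retryDelay({ type: 'exponential', interval: 1000 }, 3)); // 4000 (1s → 2s → 4s)
console.log(retryDelay({ type: 'series', intervals: [60_000, 300_000] }, 5)); // 300000
```

Under this model, `resetRetriesOnDataChange: true` would correspond to resetting both the attempt counter and `firstFailureAt` when the watched data changes.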
@@ -538,95 +492,139 @@ retryPolicy: {
|
|
|
538
492
|
}
|
|
539
493
|
```
|
|
540
494
|
|
|
541
|
-
###
|
|
495
|
+
### Cleanup Policy
|
|
542
496
|
|
|
543
|
-
|
|
497
|
+
The Cleanup Policy controls automatic deletion of orphaned task records — tasks whose source documents have been deleted or no longer match the configured filter.
|
|
544
498
|
|
|
545
|
-
|
|
499
|
+
#### Configuration
|
|
546
500
|
|
|
547
|
-
|
|
501
|
+
```typescript
|
|
502
|
+
cleanupPolicy?: {
|
|
503
|
+
deleteWhen?: 'sourceDocumentDeleted' | 'sourceDocumentDeletedOrNoLongerMatching' | 'never';
|
|
504
|
+
keepFor?: string | number;
|
|
505
|
+
}
|
|
506
|
+
```
|
|
548
507
|
|
|
549
|
-
|
|
508
|
+
| Property | Type | Default | Description |
|
|
509
|
+
|----------|------|---------|-------------|
|
|
510
|
+
| `deleteWhen` | `string` | `'sourceDocumentDeleted'` | When to trigger task deletion |
|
|
511
|
+
| `keepFor` | `string \| number` | `'24h'` | Grace period before deletion (e.g., `'1h'`, `'7d'`, or `86400000` ms) |
|
|
550
512
|
|
|
551
|
-
|
|
552
|
-
|
|
553
|
-
|
|
#### Deletion Strategies (`deleteWhen`)

| Strategy | Behavior |
|----------|----------|
| `sourceDocumentDeleted` | **Default.** Task deleted only when its source document is deleted from the database. Filter mismatches are ignored. |
| `sourceDocumentDeletedOrNoLongerMatching` | Task deleted when the source document is deleted **OR** when it no longer matches the task's `filter`. Useful when the document change is permanent and the document is not expected to match (and thus re-trigger the task) again in the future. Also useful for `$$NOW`-based or dynamic filters. |
| `never` | Tasks are never automatically deleted. Use for audit trails or manual cleanup scenarios. |

#### Grace Period Calculation

The `keepFor` grace period is measured from `MAX(updatedAt, lastFinalizedAt)`:

- **`updatedAt`**: When the source document's watched fields (`watchProjection`) last changed
- **`lastFinalizedAt`**: When a worker last completed or failed the task

This ensures tasks are protected if either:

1. The source data changed recently, OR
2. A worker processed the task recently

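A minimal sketch of this rule (hypothetical helper names; the real check runs database-side inside the cleanup job):

```typescript
// Hypothetical sketch of the grace-period check described above.
// A task becomes eligible for deletion only after `keepFor` has elapsed
// since the LATER of its last data change and its last finalization.
function isPastGracePeriod(
    updatedAt: Date,       // last change of the watched fields
    lastFinalizedAt: Date, // last completion/failure by a worker
    keepForMs: number,     // e.g. '1h' = 3_600_000
    now: Date = new Date(),
): boolean {
    const anchor = Math.max(updatedAt.getTime(), lastFinalizedAt.getTime());
    return now.getTime() - anchor >= keepForMs;
}
```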
#### Example: Dynamic Filter Cleanup

```typescript
await reactiveTask({
    task: 'remind-pending-order',
    collection: 'orders',
    // Match orders pending for more than 24 hours
    filter: { $expr: { $gt: ['$$NOW', { $add: ['$createdAt', 24 * 60 * 60 * 1000] }] } },

    cleanupPolicy: {
        deleteWhen: 'sourceDocumentDeletedOrNoLongerMatching',
        keepFor: '1h', // Keep it at least 1 hour after last scheduled matching or finalization
    },

    handler: async (order) => { /* Send reminder email */ }
});
```

#### Scheduler-Level Configuration

Control how often the cleanup runs using `reactiveTaskCleanupInterval` in the scheduler options. Cleanup is performed in **batches** (default 1000 items) to ensure stability on large datasets.

```typescript
await mongodash.init({
    // ...
    reactiveTaskCleanupInterval: '12h', // Run cleanup every 12 hours (default: '24h')
});
```

Supported formats:

- Duration string: `'1h'`, `'24h'`, `'7d'`
- Milliseconds: `86400000`
- Cron expression: `'CRON 0 3 * * *'` (e.g., daily at 3 AM)

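As an illustration, a parser for the duration-string and millisecond forms might look like this (a hypothetical helper; the library's own parser also accepts `CRON …` expressions, which are not handled here):

```typescript
// Hypothetical sketch: normalize '1h' / '24h' / '7d' duration strings
// or raw millisecond numbers, the two simple forms accepted by
// keepFor and reactiveTaskCleanupInterval.
function parseIntervalMs(interval: string | number): number {
    if (typeof interval === 'number') return interval; // already milliseconds
    const match = /^(\d+)([smhd])$/.exec(interval);
    if (!match) throw new Error(`Unsupported interval: ${interval}`);
    const unitMs: Record<string, number> = { s: 1_000, m: 60_000, h: 3_600_000, d: 86_400_000 };
    return Number(match[1]) * unitMs[match[2]];
}
```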
### Filter Evolution and Reconciliation

Reactive Tasks are designed to evolve with your application. As you deploy new versions of your code, you might change the `filter`, `watchProjection`, or the `handler` logic itself. The system automatically detects these changes and adapts the task state accordingly.

You can control this behavior using the optional `evolution` configuration:

```typescript
await reactiveTask({
    task: 'process-order',
    collection: 'orders',
    filter: { status: 'paid', amount: { $gt: 100 } },

    // Logic Versioning
    evolution: {
        // Increment this when you change the handler code and want to re-process tasks
        handlerVersion: 2,

        // What to do when the version increments?
        // - 'none': Do nothing (default).
        // - 'reprocess_failed': Reset all 'failed' tasks to 'pending' to retry with new code.
        // - 'reprocess_all': Reset ALL tasks (even completed ones) to 'pending'.
        onHandlerVersionChange: 'reprocess_failed',

        // If 'filter' or 'watchProjection' changes, should we run reconciliation?
        // Default: true
        reconcileOnTriggerChange: true
    },

    handler: async (order) => { /* ... */ }
});
```

#### 1. Trigger Evolution (Filter / Projection)

When the scheduler starts, it compares the current `filter` and `watchProjection` with the stored configuration from the previous deployment.

* **Narrowing the Filter** (e.g., `amount > 50` → `amount > 100`):
    * **Pending Tasks**: Workers will pick up pending tasks. Before execution, they perform a "Late-Binding Check". If the document no longer matches the new filter (e.g. amount is 75), the task is **skipped** (completed) without running the handler.
    * **Existing Tasks**: Tasks for documents that no longer match are **not deleted** immediately; they remain as history but won't satisfy the filter for future updates. See the cleanup policies for more details.

* **Widening the Filter** (e.g., `amount > 100` → `amount > 50`):
    * **Reconciliation**: The system detects the filter change and automatically triggers a **Reconciliation** scan for this specific task.
    * **Backfilling**: It scans the collection for documents that *now* match the new filter (e.g. amount 75) but don't have a task yet. It schedules new tasks for them immediately.
    * *Note*: This ensures newly-matched documents get processed without needing a manual migration script.

> [!WARNING]
> **Dynamic Filters (e.g., `$$NOW`)**: If your filter uses time-based expressions to "widen" the range automatically over time (e.g. `{ $expr: { $lte: ['$releaseDate', '$$NOW'] } }`), this does **NOT** trigger reconciliation. The scheduler only detects changes to the *filter definition object*. Documents that match purely because time has passed (without a data change) will **not** be picked up. For time-based triggers, use a [Cron Task](./cron-tasks.md).

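The "Late-Binding Check" for a narrowed filter can be sketched as follows (a hypothetical simplification; the real check evaluates the configured MongoDB filter, not a predicate function):

```typescript
// Hypothetical sketch of the late-binding check described above.
// Before running the handler, the worker re-evaluates the CURRENT filter;
// documents that no longer match are completed without execution.
interface Order { amount: number }

const currentFilter = (doc: Order) => doc.amount > 100; // narrowed from > 50

function dispositionFor(doc: Order): 'run-handler' | 'skip-completed' {
    return currentFilter(doc) ? 'run-handler' : 'skip-completed';
}
```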
#### 2. Logic Evolution (Handler Versioning)

Sometimes you fix a bug in your handler and want to retry failed tasks, or you implement a new feature (e.g. a generic data migration) and want to re-run the task for *every* document.

* **Versioning**: Increment `evolution.handlerVersion` (integer, default 1).
* **Policies (`onHandlerVersionChange`)**:
    * `'none'`: The system acknowledges the new version but doesn't touch existing task states. New executions will use the new code.
    * `'reprocess_failed'`: Finds all tasks currently in `failed` status and resets them to `pending` (resetting the attempt count). Useful for bug fixes.
    * `'reprocess_all'`: Resets **ALL** tasks (failed, completed) to `pending`. Useful for migrations or re-calculating data for the entire dataset.

> [!TIP]
> Use `reprocess_failed` for bug fixes and `reprocess_all` sparingly for data migrations. The system automatically handles the reset operation efficiently using database-side updates.

#### Reconciliation & Reliability

The system includes a self-healing mechanism called **Reconciliation**.

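Conceptually, each reset policy maps to a single database-side bulk update. The sketch below shows plausible filter shapes only (hypothetical field names; the library's actual task schema may differ):

```typescript
// Hypothetical sketch: which tasks each onHandlerVersionChange policy
// would select for a reset to 'pending' on a version bump.
const resetFilters: Record<'none' | 'reprocess_failed' | 'reprocess_all', object | null> = {
    none: null,                                                  // touch nothing
    reprocess_failed: { status: 'failed' },                      // retry after a bug fix
    reprocess_all: { status: { $in: ['failed', 'completed'] } }, // full re-run / migration
};
// e.g. tasksCollection.updateMany(filter, { $set: { status: 'pending' } })
```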
Reconciliation is **persistent and resilient**.

- If the `watchProjection` hasn't changed since the last run (comparing `lastObservedValues`), the task is **not** re-triggered.
- **Recommendation**: Carefully configure `filter` and `watchProjection` to minimize unnecessary processing during reconciliation.

### Change Detection and Storage Optimization

To ensure reliability and efficiency, the system needs to determine *when* to trigger a task.

**How it works:**

1. **State Persistence**: For every source document, a corresponding "task document" is stored in the `[collection]_tasks` collection.
2. **Snapshotting**: This task document holds a snapshot of the source document's fields (specifically, the result of `watchProjection`).
3. **Diffing**: When an event occurs (or during reconciliation), the system compares the current state of the document against the stored snapshot (`lastObservedValues`).
4. **No-Op**: If the watched fields haven't changed, **no task is triggered**. This guarantees reliability and prevents redundant processing.

**Storage Implications:**

- **Task Persistence**: The task document remains in the `_tasks` collection as long as the source document exists. It is only removed when the source document is deleted.
- **Optimization**: If `watchProjection` is **not defined**, the system copies the **entire source document** into the task document.
- **Recommendation**: For collections with **large documents** or **large datasets**, always define `watchProjection`. This significantly reduces storage usage and improves performance by copying only the necessary data subset.
- **Tip**: If you want to trigger the task on *any* change but avoid storing the full document, watch a versioning field like `updatedAt`, `lastModifiedAt`, or `_version`.

```typescript
// Triggers on any update (assuming your app updates 'updatedAt'),
// but stores ONLY the 'updatedAt' date in the tasks collection.
watchProjection: { updatedAt: 1 }
```

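The snapshot-and-diff steps above can be sketched as a pair of small helpers (hypothetical; the real comparison runs inside the scheduler and must handle key order, `Date`s, `ObjectId`s, etc.):

```typescript
// Hypothetical sketch of the change-detection steps described above.
// A task is (re)triggered only when the projected view of the source
// document differs from the stored `lastObservedValues` snapshot.
type Snapshot = Record<string, unknown>;

function project(doc: Snapshot, watchProjection: Record<string, 1>): Snapshot {
    const out: Snapshot = {};
    for (const key of Object.keys(watchProjection)) out[key] = doc[key];
    return out;
}

function shouldTrigger(doc: Snapshot, lastObservedValues: Snapshot, watchProjection: Record<string, 1>): boolean {
    // Naive deep comparison via JSON for illustration only.
    return JSON.stringify(project(doc, watchProjection)) !== JSON.stringify(lastObservedValues);
}
```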
## Operations & Monitoring

### Task Management & DLQ

You can programmatically manage tasks, investigate failures, and handle Dead Letter Queues (DLQ) using the exported management API.

These functions allow you to build custom admin UIs or automated recovery workflows.

#### Listing Tasks

Use `getReactiveTasks` to inspect the queue. You can filter by task name, status, error message, or properties of the **source document**.

```typescript
import { getReactiveTasks } from 'mongodash';

// list currently failed tasks
const failedTasks = await getReactiveTasks({
    task: 'send-welcome-email',
    status: 'failed'
});

// list with pagination
const page1 = await getReactiveTasks(
    { task: 'send-welcome-email' },
    { limit: 50, skip: 0, sort: { scheduledAt: -1 } }
);

// Advanced: Helper to find task by properties of the SOURCE document
// This is powerful: "Find the task associated with Order #123"
const orderTasks = await getReactiveTasks({
    task: 'sync-order',
    sourceDocFilter: { _id: 'order-123' }
});

// Advanced: Find tasks where source document matches complex filter
// "Find sync tasks for all VIP users"
const vipTasks = await getReactiveTasks({
    task: 'sync-order',
    sourceDocFilter: { isVip: true }
});
```

#### Counting Tasks

Use `countReactiveTasks` for metrics or UI badges.

```typescript
import { countReactiveTasks } from 'mongodash';

const dlqSize = await countReactiveTasks({
    task: 'send-welcome-email',
    status: 'failed'
});
```

#### Retrying Tasks

Use `retryReactiveTasks` to manually re-trigger tasks. This is useful for DLQ recovery after fixing a bug.

This operation is **concurrency-safe**. If a task is currently `processing`, it will be marked to re-run immediately after the current execution finishes (`processing_dirty`), ensuring no race conditions.

```typescript
import { retryReactiveTasks } from 'mongodash';

// Retry ALL failed tasks for a specific job
const result = await retryReactiveTasks({
    task: 'send-welcome-email',
    status: 'failed'
});
console.log(`Retried ${result.modifiedCount} tasks.`);

// Retry specific task by Source Document ID
await retryReactiveTasks({
    task: 'sync-order',
    sourceDocFilter: { _id: 'order-123' }
});

// Bulk Retry: Retry all tasks for "VIP" orders
// This efficiently finds matching tasks and schedules them for execution.
await retryReactiveTasks({
    task: 'sync-order',
    sourceDocFilter: { isVip: true }
});
```

### Monitoring

Mongodash provides built-in Prometheus metrics to monitor your reactive tasks.

> npm install prom-client
> ```

#### Configuration

Monitoring is configured in the initialization options under the `monitoring` key:

```typescript
await mongodash.init({
    // ...
    monitoring: {
        enabled: true, // Default: true
        scrapeMode: 'cluster', // 'cluster' (default) or 'local'
        pushIntervalMs: 60000, // How often instances synchronize metrics (default: 1m). Relevant only if scrapeMode is 'cluster'.
        readPreference: 'secondaryPreferred' // 'primary', 'secondaryPreferred' etc.
    }
});
```

- `'cluster'` (Default): Returns aggregated system-wide metrics. Any instance can respond to this request (by fetching state from the DB). It aggregates metrics from all other active instances. (Recommended for Load Balancers / Heroku)
- `'local'`: Returns local metrics for THIS instance. If this instance is the Leader, it ALSO includes Global System Metrics (Queue Depth, Lag) so they are reported exactly once in the cluster. (Recommended for K8s Pod Monitors)

#### Retrieving Metrics

Expose the metrics endpoint (e.g., in Express):

```typescript
app.get('/metrics', async (req, res) => {
    // …
});
```

#### Available Metrics

The system exposes the following metrics with standardized labels:

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `reactive_tasks_change_stream_lag_seconds` | Gauge | *none* | Time difference between now and the last processed Change Stream event. |
| `reactive_tasks_last_reconciliation_timestamp_seconds` | Gauge | *none* | Timestamp when the last full reconciliation (recovery) finished. |

#### Grafana Dashboard

A comprehensive **Grafana Dashboard** ("Reactive Tasks - System Overview") is included with the package.

You can find the dashboard JSON file at:

Import this file directly into Grafana to get started.

### Graceful Shutdown

When shutting down your application, call `stopReactiveTasks()` in your termination signal handlers to ensure in-progress tasks complete and resources are released cleanly.

> [!NOTE]
> **Self-Healing Design**: While graceful shutdown is recommended best practice, the system is designed to be resilient. If your application crashes or is forcefully terminated, task locks will automatically expire after a timeout (default: 1 minute), allowing other instances to pick up and process the unfinished tasks. Similarly, leadership locks expire, ensuring another instance takes over. This guarantees eventual task processing even in failure scenarios.

## Architecture & Internals

### Architecture & Scalability

The system uses a **Leader-Worker** architecture to balance efficiency and scalability.

#### 1. The Leader (Planner)

- **Role**: A single instance is elected as the **Leader**.
- **Responsibility**: It listens to the MongoDB Change Stream, calculates the necessary tasks (based on `watchProjection`), and persists them into the `_tasks` collection. To minimize memory usage, it only fetches the document `_id` from the Change Stream event.

> [!NOTE]
> **Database Resolution**: The Change Stream is established on the database of the **first registered reactive task**.

- **Resilience**: Leadership is maintained via a distributed lock with a heartbeat. If the leader crashes, another instance automatically takes over (failover).

#### 2. The Workers (Executors)

- **Role**: *Every* instance (including the leader) runs a set of **Worker** loops (driven by the event loop).
- **Responsibility**: Workers poll the `_tasks` collection for `pending` jobs, lock them, and execute the `handler`.
- **Adaptive Polling**: Workers use an **adaptive polling** mechanism.
    - **Idle**: If no tasks are found, the polling frequency automatically lowers (saving CPU/IO).
    - **Busy**: If tasks are found (or the **local** Leader signals new work), the frequency speeds up immediately to process the queue as fast as possible. Workers on other instances speed up once they independently find a task during their regular polling.

### Task States & Lifecycle

Every task record in the `_tasks` collection follows a specific lifecycle:

| Status | Description |
| :--- | :--- |
| `pending` | Task is waiting to be processed by a worker. This is the initial state after scheduling or a re-trigger. |
| `processing` | Task is currently locked and being worked on by an active worker instance. |
| `processing_dirty` | **Concurrency guard.** New data was detected while the worker was already processing the previous state. The task will be reset to `pending` immediately after the current run finishes, so no updates are missed. |
| `completed` | Task was processed successfully, or it did not match the filter during the last attempt. |
| `failed` | Task permanently failed after exceeding all retries or the `maxDuration` window. |