@arvo-tools/postgres 1.2.0 → 1.2.2

This diff shows the content of publicly released package versions as they appear in their respective public registries. It is provided for informational purposes only.

Files changed (2)

  1. package/README.md +442 -0
  2. package/package.json +2 -2

package/README.md ADDED

# @arvo-tools/postgres

**PostgreSQL-backed infrastructure for building scalable, reliable event-driven workflow orchestration systems in the Arvo ecosystem.**

[![npm version](https://badge.fury.io/js/%40arvo-tools%2Fpostgres.svg)](https://www.npmjs.com/package/@arvo-tools/postgres)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

This package provides two core components for distributed, event-driven orchestration of Arvo-based components in your application:

### PostgresEventBroker

- **Automatic Event Routing** - Routes ArvoEvents between handlers based on the event's destination
- **Persistent Queues** - PostgreSQL-backed job queues ensure no events are lost
- **Configurable Retry Logic** - Exponential backoff, retry limits, and dead letter queues
- **Workflow Completion Handling** - Register listeners for workflow completion events
- **Domained Event Support** - Handle special events requiring external interactions (human approvals, notifications)
- **OpenTelemetry Integration** - Distributed tracing across the entire event workflow
- **Queue Monitoring** - Built-in statistics for queue health and performance

### PostgresMachineMemory

- **Persistent State Storage** - Workflow instance data stored in PostgreSQL
- **Optimistic Locking** - Version counters prevent concurrent state modification conflicts
- **Distributed Locking** - TTL-based locks with automatic expiration prevent deadlocks
- **Hierarchical Workflows** - Track parent-child relationships for complex orchestrations
- **Automatic Cleanup** - Optional removal of completed workflow data
- **Connection Pooling** - Efficient database connection management
- **OpenTelemetry Support** - Optional instrumentation for observability

## Installation

This package is designed for Arvo-based components in your applications. To get the best value out of it, use it in conjunction with [Arvo](https://www.arvo.land).

```bash
pnpm install @arvo-tools/postgres
```

## Requirements

- Node.js >= 22.12.0
- PostgreSQL database
- Required database tables (see [Database Setup](#database-setup))

## Database Setup

This package provides an abstraction layer on top of your PostgreSQL database so that Arvo event handlers and orchestrators can use the database to distribute events and persist their state for durable execution.

`PostgresMachineMemory` requires tables to store and organize the state of the event handlers and orchestrators. The `connectPostgresMachineMemory` function discussed below can automatically create the required tables in your PostgreSQL database. However, if you cannot grant it write permission, refer to the table schema documentation to deploy the tables manually:

- **[Version 1](./src/memory/v1/README.md)**

The `PostgresEventBroker` (built on PgBoss) automatically creates its required tables on first connection. See the [pg-boss documentation](https://timgit.github.io/pg-boss/#/) for its migration pattern.

## Usage

### PostgresMachineMemory

The orchestrators in Arvo, namely `ArvoOrchestrator` and `ArvoResumable`, require a memory backend to persist their state for distributed event-driven operations.

#### Basic Setup

```typescript
import {
  connectPostgresMachineMemory,
  releasePostgressMachineMemory
} from '@arvo-tools/postgres';
import {
  type IMachineMemory,
  type EventHandlerFactory,
  createArvoOrchestrator
} from 'arvo-event-handler';

// Establish a connection to postgres for machine memory operations
const memory = await connectPostgresMachineMemory({
  version: 1,
  config: {
    connectionString: process.env.POSTGRES_CONNECTION_STRING,
  },
  migrate: 'create_if_not_exists',
});

// Create an ArvoOrchestrator with the memory interface for dependency injection
const orchestratorHandler: EventHandlerFactory<{ memory: IMachineMemory }> = ({ memory }) =>
  createArvoOrchestrator({
    // ... your orchestrator config
    memory: memory
  });

const orchestrator = orchestratorHandler({ memory });

// Always release when done
await releasePostgressMachineMemory(memory);
```

This example demonstrates connecting the PostgreSQL machine memory to a typical Arvo event handler (in this case `ArvoOrchestrator`). `connectPostgresMachineMemory` takes a `version` parameter that establishes the table structure used to persist state. This allows safe package versioning without requiring complex table migrations on your deployments: table migrations are rolled out based on this `version`, while implementation updates are rolled out with package versions.

The `migrate` field configures the migration behavior, telling the connection to create the required tables before establishing the connection if they do not already exist. By default this field is `'noop'`, which performs no migration actions at all.

Once the memory has been defined and established, you can pass it to any Arvo event handler that can use it, and that's it.

#### Advanced Configuration

For production environments or specific use cases, you can configure the PostgreSQL machine memory with advanced settings, including custom table names, connection pooling, distributed locking behavior, and observability features.

```typescript
const memory = await connectPostgresMachineMemory({
  version: 1,

  // Custom table names (optional)
  tables: {
    state: 'custom_state_table',
    lock: 'custom_lock_table',
    hierarchy: 'custom_hierarchy_table'
  },

  config: {
    // Connection via connection string
    connectionString: process.env.POSTGRES_CONNECTION_STRING,

    // OR via individual parameters
    // host: 'localhost',
    // port: 5432,
    // user: 'postgres',
    // password: 'postgres',
    // database: 'mydb',

    // Connection pool settings
    max: 20,                        // Maximum pool size (default: 10)
    idleTimeoutMillis: 30000,       // Idle client timeout (default: 30000)
    connectionTimeoutMillis: 5000,  // Connection acquisition timeout (default: 5000)
    statementTimeoutMillis: 30000,  // Statement execution timeout (optional)
    queryTimeoutMillis: 30000,      // Query execution timeout (optional)

    // Distributed lock configuration
    lockConfig: {
      maxRetries: 5,       // Lock acquisition retry attempts (default: 3)
      initialDelayMs: 50,  // Initial retry delay (default: 100)
      backoffExponent: 2,  // Exponential backoff multiplier (default: 1.5)
      ttlMs: 180000        // Lock TTL in milliseconds (default: 120000)
    },

    // Feature flags
    enableCleanup: true,  // Auto-cleanup completed workflows (default: false)
    enableOtel: true      // OpenTelemetry tracing (default: false)
  },

  // Migration strategy
  migrate: 'create_if_not_exists'  // Options: 'noop' | 'create_if_not_exists' | 'dangerousely_force_migration'
});
```

**Migration Strategies:**

- **`'noop'` (default)** - No migration actions. Tables must already exist or the connection will fail during validation.
- **`'create_if_not_exists'`** - Creates tables if they don't exist. Safe for production use.
- **`'dangerousely_force_migration'`** - Drops and recreates all tables, destroying existing data. Use only in development/testing environments.

**Lock Configuration:**

Configure lock behavior based on your workflow characteristics. Longer-running workflows need higher `ttlMs` values to prevent premature lock expiration. Increase `maxRetries` and adjust `backoffExponent` for high-contention scenarios where multiple processes compete for the same workflow locks. The defaults in Arvo and in this package are appropriate for the vast majority of use cases.

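As a rough mental model, the retry schedule these fields describe can be sketched in a few lines. This is an illustration of the exponential backoff idea only, assuming a delay of `initialDelayMs * backoffExponent^attempt` per retry; it is not the package's internal code.

```typescript
// Illustrative only: approximate per-attempt delay for TTL-based lock
// retries. The parameter names mirror the lockConfig fields above.
function lockRetryDelays(
  maxRetries: number,
  initialDelayMs: number,
  backoffExponent: number
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Each retry waits initialDelayMs * backoffExponent^attempt
    delays.push(initialDelayMs * Math.pow(backoffExponent, attempt));
  }
  return delays;
}

// With the defaults (3 retries, 100ms initial delay, 1.5x backoff):
console.log(lockRetryDelays(3, 100, 1.5)); // [ 100, 150, 225 ]
```

Under the defaults, a contended lock is retried after roughly 100 ms, 150 ms, and 225 ms before acquisition fails.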
### PostgresEventBroker

Your PostgreSQL database can be further leveraged as a robust event broker for Arvo event handlers. Conceptually, each event handler you register gets its own dedicated task queue, providing isolated processing channels for different parts of your workflow. When an event is emitted through the broker, an event router inspects the `event.to` field and routes the event to the appropriate handler's queue for processing. This ensures reliable, ordered delivery of events to their intended destinations.

This implementation uses `PgBoss` as the underlying job queue mechanism, providing battle-tested reliability, persistence, and retry capabilities. `PostgresEventBroker` extends the `PgBoss` class with Arvo-specific functionality such as automatic event routing, workflow completion handling, and domained event support. This design makes integration with your existing Arvo event handlers seamless, requiring minimal code changes while gaining the benefits of PostgreSQL-backed reliability and scalability.

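The routing idea described above can be illustrated with a toy sketch. Everything here (the `ToyRouter` class, the event shape, the queue name) is hypothetical and exists only to show the `event.to`-based dispatch concept, not the broker's actual implementation:

```typescript
// Conceptual sketch: one queue per registered handler, routed by `event.to`.
type ToyEvent = { to: string; type: string; data: unknown };

class ToyRouter {
  private queues = new Map<string, ToyEvent[]>();

  // One dedicated queue per registered handler source
  register(source: string): void {
    this.queues.set(source, []);
  }

  // Route by inspecting `event.to`; returns false when unroutable
  dispatch(event: ToyEvent): boolean {
    const queue = this.queues.get(event.to);
    if (!queue) return false; // the real broker would invoke onHandlerNotFound
    queue.push(event);
    return true;
  }

  pending(source: string): number {
    return this.queues.get(source)?.length ?? 0;
  }
}

const router = new ToyRouter();
router.register('com.calculator.run');
router.dispatch({ to: 'com.calculator.run', type: 'calc', data: { value: 42 } });
console.log(router.pending('com.calculator.run')); // 1
```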
#### Basic Setup

```typescript
import { PostgresEventBroker } from '@arvo-tools/postgres';
import { createArvoEventFactory } from 'arvo-core';

// Initialize broker
const broker = new PostgresEventBroker({
  connectionString: 'postgresql://user:password@localhost:5432/mydb'
});

await broker.start();

// Set up workflow completion handler
await broker.onWorkflowComplete({
  source: 'my.workflow',
  listener: async (event) => {
    console.log('Workflow completed:', event.data);
  },
  options: {
    worker: {
      concurrency: 5
    }
  }
});

// Register event handlers
await broker.register(myHandler, {
  recreateQueue: true,
  queue: {
    deadLetter: 'my_dlq'
  },
  worker: {
    concurrency: 10,
    retryLimit: 3,
    retryBackoff: true,
    pollingIntervalSeconds: 2
  }
});

// Dispatch events
const event = createArvoEventFactory(myContract.version('1.0.0')).accepts({
  source: 'my.workflow',
  data: { value: 42 }
});

await broker.dispatch(event);
```

#### Handler Registration with Retry Configuration

```typescript
await broker.register(calculatorHandler, {
  recreateQueue: true,
  queue: {
    policy: 'standard',
    deadLetter: 'calculator_dlq',
    warningQueueSize: 1000
  },
  worker: {
    concurrency: 5,
    retryLimit: 5,
    retryBackoff: true,
    retryDelay: 10,            // 10 seconds
    retryDelayMax: 300,        // 5 minutes max
    expireInSeconds: 900,      // 15 minutes timeout
    pollingIntervalSeconds: 2
  }
});
```
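When `retryBackoff` is enabled, failed jobs are retried with growing delays capped by `retryDelayMax`. As an illustration only (this is not PgBoss's exact internal formula), a capped doubling schedule behaves like this:

```typescript
// Rough illustration of a capped exponential retry schedule. Parameter
// names mirror the worker options above; the doubling rule is an assumption.
function retrySchedule(
  retryLimit: number,
  retryDelaySeconds: number,
  retryDelayMaxSeconds: number
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retryLimit; attempt++) {
    // Double the delay on each attempt, clamped at retryDelayMax
    delays.push(Math.min(retryDelaySeconds * 2 ** attempt, retryDelayMaxSeconds));
  }
  return delays;
}

// retryLimit: 5, retryDelay: 10s, with a tighter 100s cap to show clamping:
console.log(retrySchedule(5, 10, 100)); // [ 10, 20, 40, 80, 100 ]
```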

#### Handling Domained Events

```typescript
// Handle events that require external system interaction
broker.onDomainedEvent(async (event) => {
  if (event.domain === 'human.interaction') {
    await notificationService.send(event.data);
  } else if (event.domain === 'external.api') {
    await externalAPI.process(event.data);
  }
});
```

#### Custom Error Handling

```typescript
// Handle events with no registered destination
broker.onHandlerNotFound(async (event) => {
  logger.error('No handler found for event:', {
    eventType: event.type,
    destination: event.to,
    source: event.source
  });
  await alertingService.notify('Unrouted event detected');
});
```

#### Custom Logger

```typescript
import winston from 'winston';

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.Console(),
    new winston.transports.File({ filename: 'broker.log' })
  ]
});

broker.setLogger(logger);
```

#### Queue Monitoring

```typescript
// Get statistics for all queues
const stats = await broker.getStats();

stats.forEach(stat => {
  console.log(`Queue: ${stat.name}`);
  console.log(`  Active: ${stat.activeCount}`);
  console.log(`  Queued: ${stat.queuedCount}`);
  console.log(`  Total: ${stat.totalCount}`);
});
```

#### Cleanup

```typescript
// Stop broker and clean up resources
await broker.stop();
```
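In long-running services, tie both cleanup calls to process shutdown so queues drain and connection pools close. The sketch below is hypothetical wiring that assumes the `broker` and `memory` instances from the earlier examples; `runShutdownSteps` is a helper defined here, not a package export:

```typescript
// Run named async cleanup steps in order, continuing past failures so a
// single broken step cannot prevent the rest of the shutdown.
async function runShutdownSteps(
  steps: Array<[string, () => Promise<void>]>
): Promise<string[]> {
  const completed: string[] = [];
  for (const [name, step] of steps) {
    try {
      await step();
      completed.push(name);
    } catch (err) {
      console.error(`shutdown step "${name}" failed`, err);
    }
  }
  return completed;
}

// Hypothetical wiring with the earlier `broker` and `memory` instances:
// process.on('SIGTERM', () =>
//   runShutdownSteps([
//     ['broker', () => broker.stop()],
//     ['memory', () => releasePostgressMachineMemory(memory)],
//   ]).then(() => process.exit(0))
// );
```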

### Configuration Reference

#### PostgresEventBroker Options

Extends the PgBoss configuration. See the [PgBoss documentation](https://github.com/timgit/pg-boss) for the full set of options.

```typescript
new PostgresEventBroker({
  connectionString: string,
  // ... or individual connection params
  host?: string,
  port?: number,
  database?: string,
  user?: string,
  password?: string,

  // PgBoss options
  schema?: string,
  max?: number,
  // ... see PgBoss docs for more
})
```

#### Handler Registration Options

```typescript
{
  recreateQueue?: boolean,  // Delete and recreate the queue on registration

  queue?: {
    policy?: 'standard' | 'short' | 'singleton' | 'stately',
    partition?: boolean,
    deadLetter?: string,
    warningQueueSize?: number
  },

  worker?: {
    // Worker config
    concurrency?: number,                 // Number of workers (default: 1)
    pollingIntervalSeconds?: number,      // Polling interval (default: 2)

    // Job options
    priority?: number,
    retryLimit?: number,                  // Number of retries (default: 2)
    retryDelay?: number,                  // Delay between retries in seconds
    retryBackoff?: boolean,               // Exponential backoff (default: false)
    retryDelayMax?: number,               // Max delay for backoff
    expireInSeconds?: number,             // Job timeout (default: 15 min)
    retentionSeconds?: number,            // How long to keep jobs (default: 14 days)
    deleteAfterSeconds?: number,          // Delete after completion (default: 7 days)
    startAfter?: number | string | Date,  // Delay job start
    singletonSeconds?: number,            // Throttle to one job per interval
    singletonNextSlot?: boolean,
    singletonKey?: string
  }
}
```

## API Reference

### PostgresEventBroker

#### Methods

- `start()` - Start the broker
- `stop()` - Stop the broker and clean up resources
- `register(handler, options?)` - Register an event handler
- `onWorkflowComplete({ source, listener, options? })` - Register a workflow completion handler
- `dispatch(event)` - Dispatch an event into the system
- `onHandlerNotFound(listener)` - Handle unroutable events
- `onDomainedEvent(listener)` - Handle domained events
- `setLogger(logger)` - Set a custom logger
- `getStats()` - Get queue statistics
- `queues` - Get an array of registered queue names

### PostgresMachineMemory

#### Methods

- `read(id)` - Read workflow state
- `write(id, data, prevData, metadata)` - Write workflow state with optimistic locking
- `lock(id)` - Acquire a distributed lock
- `unlock(id)` - Release a distributed lock
- `cleanup(id)` - Remove workflow data
- `getSubjectsByRoot(rootSubject)` - Get all child workflow subjects
- `getRootSubject(subject)` - Get the root workflow subject
- `close()` - Close the connection pool
- `validateTableStructure()` - Validate the database schema

### Factory Functions

- `connectPostgresMachineMemory(params)` - Create and validate a machine memory instance
- `releasePostgressMachineMemory(memory)` - Release machine memory resources

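The version counters behind `write(id, data, prevData, metadata)` can be understood with a toy in-memory analogue. This sketch is purely conceptual (the `ToyStateStore` class and its method shapes are hypothetical; the real implementation persists version counters in PostgreSQL):

```typescript
// Toy illustration of optimistic locking: each record carries a version
// counter, and a write succeeds only if the caller saw the latest version.
type StateRecord = { version: number; data: unknown };

class ToyStateStore {
  private store = new Map<string, StateRecord>();

  read(id: string): StateRecord | null {
    return this.store.get(id) ?? null;
  }

  // Succeeds only when expectedVersion matches the stored version
  write(id: string, data: unknown, expectedVersion: number): boolean {
    const currentVersion = this.store.get(id)?.version ?? 0;
    if (currentVersion !== expectedVersion) return false; // concurrent writer won
    this.store.set(id, { version: currentVersion + 1, data });
    return true;
  }
}

const store = new ToyStateStore();
console.log(store.write('wf-1', { step: 1 }, 0)); // true
console.log(store.write('wf-1', { step: 2 }, 0)); // false (stale version)
console.log(store.write('wf-1', { step: 2 }, 1)); // true
```

A stale writer gets a clean rejection instead of silently overwriting newer state, which is what makes concurrent orchestrator executions safe.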
## Troubleshooting

### "Table does not exist" errors

Ensure all three tables (state, lock, and hierarchy) are created before connecting. Run the factory function with the `migrate` parameter set, or create the tables via the SQL schema or a Prisma migration.

### Events not being processed

- Check that handlers are registered: `broker.queues`
- Verify the workflow completion handler is set up
- Check queue statistics: `await broker.getStats()`
- Review logs for routing errors

### Lock acquisition failures

- Increase `maxRetries` or `ttlMs`
- Check for deadlocks in application logic
- Monitor the lock table for expired locks that are not being cleaned up

### Memory leaks

- Always call `broker.stop()` and `releasePostgressMachineMemory()`

## Contributing

Contributions are welcome! Please see the [main repository](https://github.com/SaadAhmad123/arvo-tools) for contribution guidelines.

## Links

- [GitHub Repository](https://github.com/SaadAhmad123/arvo-tools)
- [Arvo Documentation](https://www.arvo.land)
- [PgBoss Documentation](https://github.com/timgit/pg-boss)
- [Issue Tracker](https://github.com/SaadAhmad123/arvo-tools/issues)

## Support

For questions and support:

- Open an issue on [GitHub](https://github.com/SaadAhmad123/arvo-tools/issues)
- Check the [Arvo documentation](https://www.arvo.land)

## Changelog

See [CHANGELOG.md](./CHANGELOG.md) for version history and changes.
package/package.json CHANGED

```diff
@@ -1,6 +1,6 @@
 {
   "name": "@arvo-tools/postgres",
-  "version": "1.2.0",
+  "version": "1.2.2",
   "description": "The official package for Arvo's execution components in Postgres",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",
@@ -28,6 +28,7 @@
   },
   "dependencies": {
     "arvo-core": "3.0.28",
+    "arvo-event-handler": "3.0.28",
     "pg": "8.16.3",
     "pg-boss": "12.5.4",
     "pg-format": "1.0.4",
@@ -38,7 +39,6 @@
     "@types/pg": "8.16.0",
     "@types/pg-format": "1.0.2",
     "@vitest/coverage-istanbul": "4.0.13",
-    "arvo-event-handler": "3.0.28",
     "vitest": "4.0.13"
   },
   "engines": {
```