@batchactions/state-sequelize 0.0.1

package/README.md ADDED
@@ -0,0 +1,109 @@
+ # @batchactions/state-sequelize
+
+ Sequelize-based `StateStore` and `DistributedStateStore` adapter for [@batchactions/core](https://www.npmjs.com/package/@batchactions/core).
+
+ Persists import job state, processed records, and distributed batch metadata to any relational database supported by Sequelize v6 (PostgreSQL, MySQL, MariaDB, SQLite, MS SQL Server).
+
+ ## Installation
+
+ ```bash
+ npm install @batchactions/state-sequelize
+ ```
+
+ **Peer dependencies:** `@batchactions/core` (>=0.1.0) and `sequelize` (^6.0.0) must be installed in your project.
+
+ ## Usage
+
+ ```typescript
+ import { BulkImport, CsvParser } from '@batchactions/import';
+ import { BufferSource } from '@batchactions/core';
+ import { SequelizeStateStore } from '@batchactions/state-sequelize';
+ import { Sequelize } from 'sequelize';
+
+ // Use your existing Sequelize instance
+ const sequelize = new Sequelize('postgres://user:pass@localhost:5432/mydb');
+
+ // Create and initialize the store (creates tables if they don't exist)
+ const stateStore = new SequelizeStateStore(sequelize);
+ await stateStore.initialize();
+
+ // Pass it to BulkImport
+ const importer = new BulkImport({
+   schema: { fields: [/* ... */] },
+   batchSize: 500,
+   continueOnError: true,
+   stateStore,
+ });
+
+ // csvString holds the CSV content to import (defined elsewhere)
+ importer.from(new BufferSource(csvString), new CsvParser());
+
+ // Called for each record; here it is written to an existing User model
+ await importer.start(async (record) => {
+   await sequelize.models.User.create(record);
+ });
+ ```
+
+ ## Database Tables
+
+ The adapter creates three tables:
+
+ - **`bulkimport_jobs`** -- Import job state (status, config, batches as JSON, distributed flag)
+ - **`bulkimport_records`** -- Individual processed records (status, raw/parsed data, errors)
+ - **`bulkimport_batches`** -- Batch metadata for distributed processing (status, workerId, version for optimistic locking)
+
+ Tables are created automatically when you call `initialize()`. The call is idempotent.
+
+ ## Distributed Processing
+
+ `SequelizeStateStore` implements the `DistributedStateStore` interface, enabling multi-worker parallel processing with [`@batchactions/distributed`](https://www.npmjs.com/package/@batchactions/distributed).
+
+ ```bash
+ npm install @batchactions/distributed
+ ```
+
+ ```typescript
+ import { DistributedImport } from '@batchactions/distributed';
+ import { SequelizeStateStore } from '@batchactions/state-sequelize';
+
+ // `sequelize` is the instance from the Usage example; `source`, `parser`,
+ // `processor`, and `workerId` are assumed to be defined by your application.
+ const stateStore = new SequelizeStateStore(sequelize);
+ await stateStore.initialize();
+
+ const di = new DistributedImport({
+   schema: { fields: [/* ... */] },
+   batchSize: 500,
+   stateStore,
+ });
+
+ // Orchestrator: prepare the job
+ const { jobId, totalBatches } = await di.prepare(source, parser);
+
+ // Worker: claim and process batches
+ const result = await di.processWorkerBatch(jobId, processor, workerId);
+ ```
+
+ ### Distributed Features
+
+ | Feature | Description |
+ |---|---|
+ | **Atomic batch claiming** | `claimBatch()` uses transactions + optimistic locking (`version` column) to ensure no two workers claim the same batch |
+ | **Stale batch recovery** | `reclaimStaleBatches(timeoutMs)` resets batches stuck in PROCESSING beyond the timeout |
+ | **Exactly-once finalization** | `tryFinalizeJob()` atomically transitions the job to COMPLETED/FAILED only once |
+ | **Batch record storage** | `saveBatchRecords()` / `getBatchRecords()` for bulk record persistence |
+ | **Distributed status** | `getDistributedStatus()` aggregates batch counts by status |
+
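The claiming and recovery rows in the table above can be illustrated with a small in-memory model. This is a minimal sketch of version-checked (optimistic) updates in general, not the adapter's actual implementation: the `BatchRow` shape, the `claimedAt` column, and the helper names `updateIfVersionMatches`, `claimWithSnapshot`, and `reclaimStale` are all hypothetical, and a `Map` stands in for the `bulkimport_batches` table.

```typescript
// Hypothetical in-memory stand-in for a row in `bulkimport_batches`.
// Column names other than `status`, `workerId`, and `version` are assumptions.
type BatchStatus = 'PENDING' | 'PROCESSING' | 'COMPLETED';

interface BatchRow {
  id: string;
  status: BatchStatus;
  workerId: string | null;
  version: number;
  claimedAt: number | null; // epoch milliseconds, null when unclaimed
}

type BatchChanges = Pick<BatchRow, 'status' | 'workerId' | 'claimedAt'>;

// Mimics `UPDATE ... WHERE id = ? AND version = ?`: the write succeeds only
// if the row still carries the version the caller read earlier, and every
// successful write bumps the version.
function updateIfVersionMatches(
  table: Map<string, BatchRow>,
  id: string,
  expectedVersion: number,
  changes: BatchChanges,
): boolean {
  const row = table.get(id);
  if (!row || row.version !== expectedVersion) return false;
  table.set(id, { ...row, ...changes, version: row.version + 1 });
  return true;
}

// Try to claim a batch based on a snapshot the worker read earlier.
// Returns false if the batch was not pending or another writer got there first.
function claimWithSnapshot(
  table: Map<string, BatchRow>,
  snapshot: BatchRow,
  workerId: string,
  now: number,
): boolean {
  if (snapshot.status !== 'PENDING') return false;
  return updateIfVersionMatches(table, snapshot.id, snapshot.version, {
    status: 'PROCESSING',
    workerId,
    claimedAt: now,
  });
}

// Reset batches that have been PROCESSING for longer than `timeoutMs`.
// Returns the number of batches put back into PENDING.
function reclaimStale(table: Map<string, BatchRow>, timeoutMs: number, now: number): number {
  let reclaimed = 0;
  for (const row of Array.from(table.values())) {
    const stale =
      row.status === 'PROCESSING' && row.claimedAt !== null && now - row.claimedAt > timeoutMs;
    if (
      stale &&
      updateIfVersionMatches(table, row.id, row.version, {
        status: 'PENDING',
        workerId: null,
        claimedAt: null,
      })
    ) {
      reclaimed++;
    }
  }
  return reclaimed;
}
```

If two workers read the same pending row and both call `claimWithSnapshot`, the first call bumps `version`, so the second call's version check fails and it returns `false`. A real store would express the same check in SQL inside a transaction.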
+ ### Recommended Databases for Distributed Mode
+
+ | Database | Row Locking | Recommended |
+ |---|---|---|
+ | PostgreSQL | `FOR UPDATE SKIP LOCKED` | Yes |
+ | MySQL 8+ | `FOR UPDATE SKIP LOCKED` | Yes |
+ | MariaDB 10.6+ | `FOR UPDATE SKIP LOCKED` | Yes |
+ | SQLite | Single-writer (no concurrent transactions) | Dev/test only |
+
+ ## Limitations
+
+ - Non-serializable schema field properties (`customValidator`, `transform`, `pattern`) are stripped when a job is saved to the database. When restoring a job, the consumer must re-inject them.
+ - SQLite does not support concurrent transactions, so distributed batch claiming is limited to sequential use in tests. Use PostgreSQL, MySQL 8+, or MariaDB 10.6+ for production distributed processing.
+
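One way to handle the first limitation is to keep the full schema in code and merge it back into the restored one by field name. The sketch below assumes a hypothetical `FieldDef` shape with `name` and `type` properties. Only `customValidator`, `transform`, and `pattern` come from this README, and `toStoredField` and `reinjectFields` are illustrative helpers, not part of the package.

```typescript
// Hypothetical field definition; `name` and `type` are assumptions.
interface FieldDef {
  name: string;
  type: string;
  pattern?: RegExp;
  transform?: (value: unknown) => unknown;
  customValidator?: (value: unknown) => boolean;
}

// What is assumed to survive persistence: plain, JSON-friendly properties only.
type StoredField = Pick<FieldDef, 'name' | 'type'>;

// Drop the properties that cannot be serialized to JSON.
function toStoredField(field: FieldDef): StoredField {
  return { name: field.name, type: field.type };
}

// Merge restored fields with the in-code definitions, matched by name.
// Stored values win for serializable properties; functions and RegExps
// are taken from the definitions. Unknown fields pass through unchanged.
function reinjectFields(stored: StoredField[], definitions: FieldDef[]): FieldDef[] {
  const byName = new Map<string, FieldDef>();
  for (const def of definitions) {
    byName.set(def.name, def);
  }
  return stored.map((field) => {
    const def = byName.get(field.name);
    return def ? { ...def, ...field } : { ...field };
  });
}
```

Matching by name keeps the persisted schema as the source of truth for serializable settings while the application code remains the only place where validators and transforms are defined.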
+ ## License
+
+ MIT