@rotorsoft/act 0.20.0 → 0.22.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -194,63 +194,21 @@ Cache and snapshots are the same checkpoint pattern at different layers:
 
  On cache hit, snapshot events in the store are skipped (`with_snaps: false`). On cache miss, the store is queried with `with_snaps: true` to find the latest snapshot and replay only events after it.
 
- #### Why cache on every commit, not just on snap?
-
- An alternative design would update the cache only at snap boundaries — since snap and cache are the same checkpoint concept. We benchmarked both strategies to test this theory.
-
- **Cache on every commit** (current — `action()` updates cache after every successful commit; throughput in ops/s):
-
- | Events | No snap | @10 | @50 | @75 | @100 |
- |---:|---:|---:|---:|---:|---:|
- | **50** | 4,872 | 5,881 | 6,480 | **7,058** | 6,949 |
- | **500** | **6,371** | 5,639 | 5,590 | 6,223 | 5,488 |
- | **2,000** | 4,257 | **5,329** | 4,573 | 4,812 | 4,039 |
-
- **Cache only on snap** (alternative — cache is only populated when `snap()` fires; throughput in ops/s):
-
- | Events | No snap | @10 | @50 | @75 | @100 |
- |---:|---:|---:|---:|---:|---:|
- | **50** | 608 | 5,845 | 6,098 | 694 | 1,006 |
- | **500** | 212 | **6,481** | 4,955 | 570 | 5,074 |
- | **2,000** | 101 | **6,827** | 5,993 | 675 | 4,039 |
-
- The snap-only strategy has three critical problems:
-
- 1. **States without `.snap()` get no cache at all** — the "no snap" column falls back to full PG replay on every `load()`, showing the same 608→101 ops/s degradation as the pre-cache baseline. Any state that doesn't configure snapping loses all caching benefit.
-
- 2. **Cache misses between snap boundaries** — with snap@75, the cache is only populated every 75 events. After seeding 50 events, no snap has fired yet, so the cache is empty. The 50-event @75 result (694 ops/s) is barely better than no cache. At 500 events, @75 only fires 6 times — after seeding, the cache holds the state from the last snap point, and `load()` must replay up to 74 tail events from the store.
-
- 3. **Fire-and-forget race conditions** — `snap()` is async (`void snap(last)`). Without the cache absorbing the version, the next `action()` call's `load()` races with the snap commit. If the `__snapshot__` event lands in the store before `load()` runs, the expected version shifts, causing `ERR_CONCURRENCY` errors. This makes seeding unreliable without artificial delays between snap boundaries.
-
- The snap-only results that match the every-commit numbers (@10 at 500/2,000 events) are cases where the cache happens to be warm from a recent snap. But this is fragile — it depends on the stream length being a favorable multiple of the snap interval.
-
- **Conclusion:** Cache on every commit is the right design. The cost of a `Map.set()` per commit is negligible, and the benefit is absolute: every `load()` after the first action is a guaranteed cache hit with zero store replay, regardless of whether snap is configured.
-
- > **InMemoryStore note:** InMemory benchmarks show ~830 ops/s across all configurations because every `InMemoryStore` method starts with `await sleep(0)` (`setTimeout(resolve, 0)`) to simulate async behavior. This event-loop yield costs ~1ms per call, capping throughput at ~1,000 ops/s. PG's indexed query for 0 new events returns in ~0.15ms.
-
- **Compared to pre-cache baselines** (PG, no cache; throughput in ops/s):
-
- | Events | Without cache | With cache | Speedup |
- |---:|---:|---:|---:|
- | **50** | 655 | 4,872 | **7×** |
- | **500** | 215 | 6,371 | **30×** |
- | **2,000** | 92 | 4,257 | **46×** |
-
- Without cache, every `load()` replays the full event stream from PG — throughput degrades linearly with stream length (655 → 92 ops/s). With always-on cache, throughput is flat (~4,000–7,000 ops/s) regardless of stream length.
-
  ### Performance Considerations
 
- - **Cache is always-on** — warm reads skip the store entirely, delivering consistent throughput regardless of stream length. No configuration needed.
+ - **Cache is always-on** — warm reads skip the store entirely, delivering consistent throughput (7–46× faster than uncached). No configuration needed.
  - **Use snapshots for cold-start resilience** — on process restart or LRU eviction, snaps limit how much of the event stream must be replayed. Set `.snap((s) => s.patches >= 50)` for most use cases.
  - **Cache invalidation is automatic** — concurrency errors (`ERR_CONCURRENCY`) invalidate the stale cache entry, forcing a fresh load from the store on the next access.
  - **Snap writes are fire-and-forget** — `snap()` commits to the store asynchronously after `action()` returns. The cache is updated synchronously within `action()`, so subsequent reads see the post-snap state immediately without waiting for the store write.
+ - **Atomic claim eliminates poll→lease overhead** — `claim()` fuses discovery and locking into a single SQL transaction using `FOR UPDATE SKIP LOCKED`, saving one round-trip per drain cycle and eliminating contention between workers.
  - Events are indexed by stream and version for fast lookups, with additional indexes on timestamps and correlation IDs.
  - The PostgreSQL adapter supports connection pooling and partitioning for high-volume deployments.
- - Active event streams remain in fast storage; consider archival strategies for very large datasets.
+
+ For detailed benchmark data and performance evolution history, see [PERFORMANCE.md](PERFORMANCE.md).
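
The layered checkpoint flow behind these bullets — warm cache read, else snapshot plus tail replay — can be sketched as a small standalone simulation. All types and names below are illustrative, not the actual act API:

```typescript
// Illustrative simulation of the cache→snapshot→replay load path.
// NOT the actual act API: a warm cache read skips the store entirely;
// a cold read resumes from the latest snapshot and replays only the tail.
type Snapshot = { version: number; state: number };
type Event = { version: number; delta: number };

const cache = new Map<string, { version: number; state: number }>();

function load(
  stream: string,
  snapshot: Snapshot | undefined,
  events: Event[]
): { version: number; state: number } {
  const hit = cache.get(stream);
  if (hit) return hit; // warm read: zero store replay

  // cold read: start from the snapshot, replay only newer events
  let state = snapshot?.state ?? 0;
  let version = snapshot?.version ?? -1;
  for (const e of events) {
    if (e.version <= version) continue; // already folded into the snapshot
    state += e.delta;
    version = e.version;
  }
  const loaded = { version, state };
  cache.set(stream, loaded); // every load (and commit) warms the cache
  return loaded;
}
```

The first `load()` replays only events newer than the snapshot; every subsequent `load()` for that stream is a pure cache hit, which is why throughput stays flat regardless of stream length.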
 
  ## Event-Driven Processing
 
- Act handles event-driven workflows through stream leasing and correlation, ensuring ordered, non-duplicated event processing without external message queues. The event store itself acts as the message backbone — events are written once and consumed by multiple independent reaction handlers.
+ Act handles event-driven workflows through atomic stream claiming and correlation, ensuring ordered, non-duplicated event processing without external message queues. The event store itself acts as the message backbone — events are written once and consumed by multiple independent reaction handlers.
 
  ### Reactions
 
@@ -268,27 +226,28 @@ const app = act()
 
  Resolvers dynamically determine which stream a reaction targets, enabling flexible event routing without hardcoded dependencies. They can include source regex patterns to limit which streams trigger the reaction.
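
In spirit, a resolver is just a pure routing function over committed events. The shapes and field names below are hypothetical illustrations, not the act signature:

```typescript
// Hypothetical resolver sketch — event shape and field names are
// assumed for illustration only, not taken from the act API.
type CommittedEvent = {
  stream: string;
  name: string;
  data: { customerId: string };
};

// Source regex: only order-* streams trigger the reaction.
const source = /^order-/;

// Route matching events to a per-customer target stream;
// non-matching streams resolve to nothing.
const resolve = (e: CommittedEvent): string | undefined =>
  source.test(e.stream) ? `customer-${e.data.customerId}` : undefined;
```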
 
- ### Stream Leasing
+ ### Stream Claiming
 
- Rather than processing events immediately, Act uses a leasing mechanism to coordinate distributed consumers. The application fetches events and pushes them to reaction handlers by leasing correlated streams:
+ Rather than processing events immediately, Act uses an atomic claim mechanism to coordinate distributed consumers. The `claim()` method atomically discovers and locks streams in a single operation using PostgreSQL's `FOR UPDATE SKIP LOCKED` pattern — competing consumers never block each other, and locked rows are silently skipped. This is the same pattern used by pgBoss, Graphile Worker, and other production job queues.
 
  - **Per-stream ordering** — Events within a stream are processed sequentially.
- - **Temporary ownership** — Leases expire after a configurable duration, allowing re-processing if a consumer fails.
- - **Backpressure** — Only a limited number of leases can be active at a time, preventing consumer overload.
+ - **Temporary ownership** — Claims expire after a configurable duration, allowing re-processing if a consumer fails.
+ - **Zero contention** — `FOR UPDATE SKIP LOCKED` means workers never block each other; locked rows are silently skipped.
+ - **Backpressure** — Only a limited number of claims can be active at a time, preventing consumer overload.
 
- If a lease expires due to failure, the stream is automatically re-leased to another consumer, ensuring no event is permanently lost.
+ If a claim expires due to failure, the stream is automatically re-claimed by another consumer, ensuring no event is permanently lost.
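
The claim semantics can be mimicked in-process. This is a behavioral sketch only — in the real store this is a single SQL transaction using `FOR UPDATE SKIP LOCKED`, and all names here are illustrative:

```typescript
// In-memory sketch of atomic claim semantics: discovery and locking
// are one step, held claims are skipped (never waited on), and
// expired claims become available again.
type Claim = { by: string; until: number };

const claims = new Map<string, Claim>();

function claim(
  worker: string,
  streams: string[],
  limit: number, // backpressure: cap claims handed out per call
  now: number,
  leaseMs: number
): string[] {
  const got: string[] = [];
  for (const s of streams) {
    if (got.length >= limit) break;
    const held = claims.get(s);
    if (held && held.until > now) continue; // "skip locked": move on, don't block
    claims.set(s, { by: worker, until: now + leaseMs }); // claim in one step
    got.push(s);
  }
  return got;
}
```

Two workers racing over the same streams never receive the same stream, and once a claim's expiry passes, the stream is re-claimable by any worker.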
 
  ### Event Correlation
 
  Act tracks causation chains across actions and reactions using correlation metadata:
 
  - Each action/event carries a `correlation` ID (request trace) and `causation` ID (what triggered it).
- - Reactions can discover new streams to process by querying uncommitted events with matching correlation IDs.
+ - `app.correlate()` scans events, discovers new target streams via reaction resolvers, and registers them with `subscribe()`. It returns `{ subscribed, last_id }`, where `subscribed` is the count of newly registered streams.
  - This enables full workflow tracing — from the initial user action through every downstream reaction.
 
  ```typescript
- // Correlate events to discover new streams for processing
- await app.correlate();
+ // Correlate events to discover and subscribe new streams for processing
+ const { subscribed, last_id } = await app.correlate();
 
  // Or run periodic background correlation
  app.start_correlations();
@@ -315,12 +274,12 @@ app.settle();
 
  // Subscribe to the "settled" lifecycle event
  app.on("settled", (drain) => {
- // drain has { fetched, leased, acked, blocked }
+ // drain has { fetched, claimed, acked, blocked }
  // notify SSE clients, update caches, etc.
  });
  ```
 
- Drain cycles continue until all reactions have caught up to the latest events. Consumers only process new work — acknowledged events are skipped, and failed events are re-leased automatically.
+ Drain cycles continue until all reactions have caught up to the latest events. Consumers only process new work — acknowledged events are skipped, and failed streams are re-claimed automatically.
 
  The `settle()` method is the recommended production pattern — it debounces rapid commits (10ms default), runs correlate→drain in a loop until the system is consistent, and emits a `"settled"` event when done.
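
Conceptually, the correlate→drain loop behaves like the sketch below. Shapes are assumed and the debounce and event-emitter plumbing are omitted — this is not the act internals, just the control flow the paragraph describes:

```typescript
// Simplified sketch of a settle loop: run correlate→drain until no
// new work appears, then report the last drain stats to a listener.
type DrainStats = { fetched: number; claimed: number; acked: number; blocked: number };

async function settleLoop(
  correlate: () => Promise<{ subscribed: number }>,
  drain: () => Promise<DrainStats>,
  onSettled: (last: DrainStats) => void
): Promise<void> {
  let last: DrainStats = { fetched: 0, claimed: 0, acked: 0, blocked: 0 };
  for (;;) {
    const { subscribed } = await correlate(); // discover new target streams
    const stats = await drain();              // push claimed events to reactions
    if (subscribed === 0 && stats.fetched === 0) break; // system is consistent
    last = stats;
  }
  onSettled(last); // analogous to emitting the "settled" lifecycle event
}
```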