@drarzter/kafka-client 0.6.7 → 0.7.0

package/README.md CHANGED
@@ -20,6 +20,7 @@ Type-safe Kafka client for Node.js. Framework-agnostic core with a first-class N
20
20
  - [Consuming messages](#consuming-messages)
21
21
  - [Declarative: @SubscribeTo()](#declarative-subscribeto)
22
22
  - [Imperative: startConsumer()](#imperative-startconsumer)
23
+ - [Iterator: consume()](#iterator-consume)
23
24
  - [Multiple consumer groups](#multiple-consumer-groups)
24
25
  - [Partition key](#partition-key)
25
26
  - [Message headers](#message-headers)
@@ -33,6 +34,12 @@ Type-safe Kafka client for Node.js. Framework-agnostic core with a first-class N
33
34
  - [Deduplication (Lamport Clock)](#deduplication-lamport-clock)
34
35
  - [Retry topic chain](#retry-topic-chain)
35
36
  - [stopConsumer](#stopconsumer)
37
+ - [Pause and resume](#pause-and-resume)
38
+ - [Circuit breaker](#circuit-breaker)
39
+ - [Reset consumer offsets](#reset-consumer-offsets)
40
+ - [Seek to offset](#seek-to-offset)
41
+ - [Message TTL](#message-ttl)
42
+ - [DLQ replay](#dlq-replay)
36
43
  - [Graceful shutdown](#graceful-shutdown)
37
44
  - [Consumer handles](#consumer-handles)
38
45
  - [onMessageLost](#onmessagelost)
@@ -83,6 +90,10 @@ Safe by default. Configurable when you need it. Escape hatches for when you know
83
90
  - **Health check** — built-in health indicator for monitoring
84
91
  - **Multiple consumer groups** — named clients for different bounded contexts
85
92
  - **Declarative & imperative** — use `@SubscribeTo()` decorator or `startConsumer()` directly
93
+ - **Async iterator** — `consume<K>()` returns an `AsyncIterableIterator<EventEnvelope<T[K]>>` for `for await` consumption; breaking out of the loop stops the consumer automatically
94
+ - **Message TTL** — `messageTtlMs` drops or DLQs messages older than a configurable threshold, preventing stale events from poisoning downstream systems after a lag spike
95
+ - **Circuit breaker** — `circuitBreaker` option applies a sliding-window breaker per topic-partition; pauses delivery on repeated DLQ failures and resumes after a configurable recovery window
96
+ - **Seek to offset** — `seekToOffset(groupId, assignments)` seeks individual partitions to explicit offsets for fine-grained replay
86
97
 
87
98
  See the [Roadmap](./ROADMAP.md) for upcoming features and version history.
88
99
 
@@ -376,7 +387,7 @@ export class OrdersService {
376
387
 
377
388
  ## Consuming messages
378
389
 
379
- Two ways — choose what fits your style.
390
+ Three ways — choose what fits your style.
380
391
 
381
392
  ### Declarative: @SubscribeTo()
382
393
 
@@ -425,6 +436,35 @@ export class OrdersService implements OnModuleInit {
425
436
  }
426
437
  ```
427
438
 
439
+ ### Iterator: consume()
440
+
441
+ Stream messages from a single topic as an `AsyncIterableIterator` — useful for scripts, one-off tasks, or any context where you prefer `for await` over a callback:
442
+
443
+ ```typescript
444
+ for await (const envelope of kafka.consume('order.created')) {
445
+ console.log('Order:', envelope.payload.orderId);
446
+ }
447
+
448
+ // Breaking out of the loop stops the consumer automatically
449
+ for await (const envelope of kafka.consume('order.created')) {
450
+ if (envelope.payload.orderId === targetId) break;
451
+ }
452
+ ```
453
+
454
+ `consume()` accepts the same `ConsumerOptions` as `startConsumer()`:
455
+
456
+ ```typescript
457
+ for await (const envelope of kafka.consume('orders', {
458
+ retry: { maxRetries: 3 },
459
+ dlq: true,
460
+ messageTtlMs: 60_000,
461
+ })) {
462
+ await processOrder(envelope.payload);
463
+ }
464
+ ```
465
+
466
+ `break`, `return`, or any early exit from the loop calls the iterator's `return()` method, which closes the internal queue and calls `handle.stop()` on the background consumer.
467
+
428
468
  ## Multiple consumer groups
429
469
 
430
470
  ### Per-consumer groupId
@@ -776,12 +816,20 @@ const myInstrumentation: KafkaInstrumentation = {
776
816
  `KafkaClient` maintains lightweight in-process event counters independently of any instrumentation:
777
817
 
778
818
  ```typescript
819
+ // Global snapshot — aggregate across all topics
779
820
  const snapshot = kafka.getMetrics();
780
821
  // { processedCount: number; retryCount: number; dlqCount: number; dedupCount: number }
781
822
 
782
- kafka.resetMetrics(); // reset all counters to zero
823
+ // Per-topic snapshot
824
+ const orderMetrics = kafka.getMetrics('order.created');
825
+ // { processedCount: 5, retryCount: 1, dlqCount: 0, dedupCount: 0 }
826
+
827
+ kafka.resetMetrics(); // reset all counters
828
+ kafka.resetMetrics('order.created'); // reset only one topic's counters
783
829
  ```
784
830
 
831
+ Passing a topic name that has not seen any events returns a zero-valued snapshot — it never throws.
832
+
785
833
  Counters are incremented in the same code paths that fire the corresponding hooks — they are always active regardless of whether any instrumentation is configured.
786
834
 
787
835
  ## Options reference
@@ -817,6 +865,12 @@ Options for `sendMessage()` — the third argument:
817
865
  | `handlerTimeoutMs` | — | Log a warning if the handler hasn't resolved within this window (ms) — does not cancel the handler |
818
866
  | `deduplication.strategy` | `'drop'` | What to do with duplicate messages: `'drop'` silently discards, `'dlq'` forwards to `{topic}.dlq` (requires `dlq: true`), `'topic'` forwards to `{topic}.duplicates` |
819
867
  | `deduplication.duplicatesTopic` | `{topic}.duplicates` | Custom destination for `strategy: 'topic'` |
868
+ | `messageTtlMs` | — | Drop (or DLQ) messages older than this many milliseconds at consumption time; evaluated against the `x-timestamp` header; see [Message TTL](#message-ttl) |
869
+ | `circuitBreaker` | — | Enable circuit breaker with `{}` for zero-config defaults; requires `dlq: true`; see [Circuit breaker](#circuit-breaker) |
870
+ | `circuitBreaker.threshold` | `5` | Number of DLQ failures within `windowSize` that opens the circuit |
871
+ | `circuitBreaker.recoveryMs` | `30_000` | Milliseconds to wait in OPEN state before entering HALF_OPEN |
872
+ | `circuitBreaker.windowSize` | `threshold × 2, min 10` | Sliding window size in messages |
873
+ | `circuitBreaker.halfOpenSuccesses` | `1` | Consecutive successes in HALF_OPEN required to close the circuit |
820
874
  | `batch` | `false` | (decorator only) Use `startBatchConsumer` instead of `startConsumer` |
821
875
  | `subscribeRetry.retries` | `5` | Max attempts for `consumer.subscribe()` when topic doesn't exist yet |
822
876
  | `subscribeRetry.backoffMs` | `5000` | Delay between subscribe retry attempts (ms) |
@@ -1025,7 +1079,7 @@ By default, retry is handled in-process: the consumer sleeps between attempts wh
1025
1079
 
1026
1080
  Benefits over in-process retry:
1027
1081
 
1028
- - **Durable** — retry messages survive a consumer restart; routing between levels and to DLQ is exactly-once via Kafka transactions
1082
+ - **Durable** — retry messages survive a consumer restart; all routing (main → retry.1, level N → N+1, retry → DLQ) is exactly-once via Kafka transactions
1029
1083
  - **Non-blocking** — the original consumer is free immediately; each level consumer only pauses its specific partition during the delay window, so other partitions continue processing
1030
1084
  - **Isolated** — each retry level has its own consumer group, so a slow level 3 consumer never blocks a level 1 consumer
1031
1085
 
@@ -1057,9 +1111,9 @@ Each level consumer uses `consumer.pause → sleep(remaining) → consumer.resum
1057
1111
 
1058
1112
  The retry topic messages carry scheduling headers (`x-retry-attempt`, `x-retry-after`, `x-retry-original-topic`, `x-retry-max-retries`) that each level consumer reads automatically — no manual configuration needed.
1059
1113
 
1060
- > **Delivery guarantee:** routing within the retry chain (retry.N → retry.N+1 and retry.N → DLQ) is **exactly-once** — each routing step is wrapped in a Kafka transaction via `sendOffsetsToTransaction`, so the produce and the consumer offset commit happen atomically. A crash at any point rolls back the transaction: the message is redelivered and the routing is retried, with no duplicate in the next level. If the EOS transaction itself fails (broker unavailable), the offset is not committed and the message stays safely in the retry topic until the broker recovers.
1114
+ > **Delivery guarantee:** the entire retry chain — including the **main consumer → retry.1** boundary — is **exactly-once**. Every routing step (main → retry.1, retry.N → retry.N+1, retry.N → DLQ) is wrapped in a Kafka transaction via `sendOffsetsToTransaction`: the produce and the consumer offset commit happen atomically. A crash at any point rolls back the transaction: the message is redelivered and the routing is retried, with no duplicate in the next level. If the EOS transaction fails (broker unavailable), the offset stays uncommitted and the message is safely redelivered; it is never lost.
1061
1115
  >
1062
- > The remaining at-least-once window is at the **main consumer → retry.1** boundary: the main consumer uses `autoCommit: true` by default, so if it crashes after routing to `retry.1` but before autoCommit fires, the message may appear twice in `retry.1`. This is the standard Kafka at-least-once trade-off for any consumer using autoCommit. Design handlers to be idempotent if this edge case is unacceptable.
1116
+ > The standard Kafka at-least-once guarantee still applies at the handler level: if your handler succeeds but the process crashes before the manual offset commit completes, the message is redelivered to the handler. Design handlers to be idempotent.
1063
1117
  >
1064
1118
  > **Startup validation:** `retryTopics` requires `retry` to be set — an error is thrown at startup if `retry` is missing. When `autoCreateTopics: false`, all `{topic}.retry.N` topics are validated to exist at startup and a clear error lists any missing ones. With `autoCreateTopics: true` the check is skipped — topics are created automatically by the `ensureTopic` path. Supported by both `startConsumer` and `startBatchConsumer`.
1065
1119
 
@@ -1079,6 +1133,163 @@ await kafka.stopConsumer();
1079
1133
 
1080
1134
  `stopConsumer(groupId)` disconnects and removes only that group's consumer, leaving other groups running. Useful when you want to pause processing for a specific topic without restarting the whole client.
1081
1135
 
1136
+ ## Pause and resume
1137
+
1138
+ Temporarily stop delivering messages from specific partitions without disconnecting the consumer:
1139
+
1140
+ ```typescript
1141
+ // Pause partition 0 of 'orders' (default group)
1142
+ kafka.pauseConsumer(undefined, [{ topic: 'orders', partitions: [0] }]);
1143
+
1144
+ // Resume it later
1145
+ kafka.resumeConsumer(undefined, [{ topic: 'orders', partitions: [0] }]);
1146
+
1147
+ // Target a specific consumer group, multiple partitions
1148
+ kafka.pauseConsumer('payments-group', [{ topic: 'payments', partitions: [0, 1] }]);
1149
+ ```
1150
+
1151
+ The first argument is the consumer group ID — pass `undefined` to target the default group. A warning is logged if the group is not found.
1152
+
1153
+ Pausing is non-destructive: the consumer stays connected and Kafka preserves the partition assignment for as long as the group session is alive. Messages accumulate in the topic and are delivered once the consumer resumes. Typical use: apply backpressure when a downstream dependency (e.g. a database) is temporarily overloaded.
1154
+
1155
+ ## Circuit breaker
1156
+
1157
+ Automatically pause delivery from a topic-partition when its DLQ error rate exceeds a threshold. After a recovery window the partition is resumed automatically.
1158
+
1159
+ **`dlq: true` is required** — the breaker counts DLQ events as failures. Without it no failures are recorded and the circuit never opens.
1160
+
1161
+ Zero-config start — all options have sensible defaults:
1162
+
1163
+ ```typescript
1164
+ await kafka.startConsumer(['orders'], handler, {
1165
+ dlq: true,
1166
+ circuitBreaker: {},
1167
+ });
1168
+ ```
1169
+
1170
+ Full config for fine-tuning:
1171
+
1172
+ ```typescript
1173
+ await kafka.startConsumer(['orders'], handler, {
1174
+ dlq: true,
1175
+ circuitBreaker: {
1176
+ threshold: 10, // open after 10 failures (default: 5)
1177
+ recoveryMs: 60_000, // wait 60 s before probing (default: 30 s)
1178
+ windowSize: 50, // track last 50 messages (default: threshold × 2, min 10)
1179
+ halfOpenSuccesses: 3, // 3 successes to close (default: 1)
1180
+ },
1181
+ });
1182
+ ```
1183
+
1184
+ State machine per `${groupId}:${topic}:${partition}`:
1185
+
1186
+ | State | Behaviour |
1187
+ | ----- | --------- |
1188
+ | **CLOSED** (normal) | Messages delivered. Failures recorded in sliding window. Opens when `failures ≥ threshold`. |
1189
+ | **OPEN** | Partition paused via `pauseConsumer`. After `recoveryMs` elapses it transitions to HALF_OPEN. |
1190
+ | **HALF_OPEN** | Partition resumed. After `halfOpenSuccesses` consecutive successes the circuit closes. Any single failure immediately re-opens it. |
1191
+
1192
+ Successful `onMessage` completions count as successes. The retry topic path is not subject to the breaker — it has its own backoff and EOS guarantees.
1193
+
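The state machine above can be sketched in a few lines. The class below is illustrative only, not the library's internal code; it mirrors the documented defaults and transitions (open on `threshold` failures in the sliding window, probe after `recoveryMs`, close after `halfOpenSuccesses` consecutive successes):

```typescript
type BreakerState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

// Illustrative sketch of one breaker instance, i.e. one `${groupId}:${topic}:${partition}` key.
class BreakerSketch {
  private state: BreakerState = 'CLOSED';
  private window: boolean[] = []; // true = DLQ failure, false = success
  private halfOpenStreak = 0;
  private openedAt = 0;

  constructor(
    private threshold = 5,
    private recoveryMs = 30_000,
    private windowSize = Math.max(threshold * 2, 10),
    private halfOpenSuccesses = 1,
  ) {}

  // Checked before delivery; OPEN blocks the partition until recoveryMs has elapsed.
  canDeliver(now: number): boolean {
    if (this.state === 'OPEN' && now - this.openedAt >= this.recoveryMs) {
      this.state = 'HALF_OPEN'; // resume and probe with real traffic
      this.halfOpenStreak = 0;
    }
    return this.state !== 'OPEN';
  }

  // Called after each handled message: failure = the message went to DLQ.
  record(failure: boolean, now: number): BreakerState {
    if (this.state === 'HALF_OPEN') {
      if (failure) {
        this.open(now); // any single failure re-opens immediately
      } else if (++this.halfOpenStreak >= this.halfOpenSuccesses) {
        this.state = 'CLOSED';
        this.window = [];
      }
      return this.state;
    }
    this.window.push(failure);
    if (this.window.length > this.windowSize) this.window.shift(); // slide the window
    if (this.window.filter(Boolean).length >= this.threshold) this.open(now);
    return this.state;
  }

  private open(now: number): void {
    this.state = 'OPEN'; // the real client pauses the partition here
    this.openedAt = now;
    this.window = [];
  }
}
```

With `threshold: 2` and `recoveryMs: 1000`, two DLQ failures open the circuit, delivery is blocked for one second, and a single success in HALF_OPEN closes it again.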
1194
+ Options:
1195
+
1196
+ | Option | Default | Description |
1197
+ | ------ | ------- | ----------- |
1198
+ | `threshold` | `5` | Number of DLQ failures within `windowSize` that opens the circuit |
1199
+ | `recoveryMs` | `30_000` | Milliseconds to wait in OPEN state before entering HALF_OPEN |
1200
+ | `windowSize` | `threshold × 2, min 10` | Sliding window size in messages |
1201
+ | `halfOpenSuccesses` | `1` | Consecutive successes in HALF_OPEN required to close the circuit |
1202
+
1203
+ ## Reset consumer offsets
1204
+
1205
+ Seek a consumer group's committed offsets to the beginning or end of a topic:
1206
+
1207
+ ```typescript
1208
+ // Seek to the beginning — re-process all existing messages
1209
+ await kafka.resetOffsets(undefined, 'orders', 'earliest');
1210
+
1211
+ // Seek to the end — skip existing messages, process only new ones
1212
+ await kafka.resetOffsets(undefined, 'orders', 'latest');
1213
+
1214
+ // Target a specific consumer group
1215
+ await kafka.resetOffsets('payments-group', 'orders', 'earliest');
1216
+ ```
1217
+
1218
+ **Important:** the consumer for the specified group must be stopped before calling `resetOffsets`. An error is thrown if the group is currently running — this prevents the reset from racing with an active offset commit.
1219
+
1220
+ ## Seek to offset
1221
+
1222
+ Seek individual topic-partitions to explicit offsets — useful when `resetOffsets` is too coarse and you need per-partition control:
1223
+
1224
+ ```typescript
1225
+ // Seek partition 0 of 'orders' to offset 100, partition 1 to offset 200
1226
+ await kafka.seekToOffset(undefined, [
1227
+ { topic: 'orders', partition: 0, offset: '100' },
1228
+ { topic: 'orders', partition: 1, offset: '200' },
1229
+ ]);
1230
+
1231
+ // Multiple topics in one call
1232
+ await kafka.seekToOffset('payments-group', [
1233
+ { topic: 'payments', partition: 0, offset: '0' },
1234
+ { topic: 'refunds', partition: 0, offset: '500' },
1235
+ ]);
1236
+ ```
1237
+
1238
+ The first argument is the consumer group ID — pass `undefined` to target the default group. Assignments are grouped by topic internally so each `admin.setOffsets` call covers all partitions of one topic.
1239
+
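That grouping can be sketched as a pure function. The shapes below are illustrative, assuming a kafkajs-style `admin.setOffsets({ groupId, topic, partitions })` signature; this is not the library's internal code:

```typescript
interface SeekAssignment {
  topic: string;
  partition: number;
  offset: string;
}

// Flatten per-partition assignments into one setOffsets-shaped call per topic.
function groupByTopic(groupId: string, assignments: SeekAssignment[]) {
  const byTopic = new Map<string, { partition: number; offset: string }[]>();
  for (const { topic, partition, offset } of assignments) {
    const partitions = byTopic.get(topic) ?? [];
    partitions.push({ partition, offset });
    byTopic.set(topic, partitions);
  }
  return [...byTopic].map(([topic, partitions]) => ({ groupId, topic, partitions }));
}
```

A call mixing `payments` and `refunds` assignments thus produces exactly two `setOffsets` payloads, each carrying all partitions for its topic.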
1240
+ **Important:** the consumer for the specified group must be stopped before calling `seekToOffset`. An error is thrown if the group is currently running.
1241
+
1242
+ ## Message TTL
1243
+
1244
+ Drop or route expired messages using `messageTtlMs` in `ConsumerOptions`:
1245
+
1246
+ ```typescript
1247
+ await kafka.startConsumer(['orders'], handler, {
1248
+ messageTtlMs: 60_000, // drop messages older than 60 s
1249
+ dlq: true, // route expired messages to DLQ instead of dropping
1250
+ });
1251
+ ```
1252
+
1253
+ The TTL is evaluated against the `x-timestamp` header stamped on every outgoing message by the producer. Messages whose age at consumption time exceeds `messageTtlMs` are:
1254
+
1255
+ - **Routed to DLQ** with `x-dlq-reason: ttl-expired` when `dlq: true`
1256
+ - **Dropped** (calling `onMessageLost`) otherwise
1257
+
1258
+ Typical use: prevent stale events from poisoning downstream systems after a consumer lag spike — e.g. discard order events or push notifications that are no longer actionable.
1259
+
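A minimal sketch of that decision, assuming the `x-timestamp` header holds a millisecond epoch string (the function and type names here are hypothetical, not the library's internals):

```typescript
type TtlVerdict = 'process' | 'dlq' | 'drop';

// Decide what to do with a message based on its age at consumption time.
function ttlVerdict(
  headers: Record<string, string | undefined>,
  opts: { messageTtlMs: number; dlq: boolean },
  now: number = Date.now(),
): TtlVerdict {
  const stamped = Number(headers['x-timestamp']);
  if (!Number.isFinite(stamped)) return 'process'; // no timestamp: cannot expire
  const ageMs = now - stamped;
  if (ageMs <= opts.messageTtlMs) return 'process';
  return opts.dlq ? 'dlq' : 'drop'; // expired: DLQ when enabled, else drop
}
```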
1260
+ ## DLQ replay
1261
+
1262
+ Re-publish messages from a dead letter queue back to the original topic:
1263
+
1264
+ ```typescript
1265
+ // Re-publish all messages from 'orders.dlq' → 'orders'
1266
+ const result = await kafka.replayDlq('orders');
1267
+ // { replayed: 42, skipped: 0 }
1268
+ ```
1269
+
1270
+ Options:
1271
+
1272
+ | Option | Default | Description |
1273
+ | ------ | ------- | ----------- |
1274
+ | `targetTopic` | `x-dlq-original-topic` header | Override the destination topic |
1275
+ | `dryRun` | `false` | Count messages without sending |
1276
+ | `filter` | — | `(headers) => boolean` — skip messages where the callback returns `false` |
1277
+
1278
+ ```typescript
1279
+ // Dry run — see how many messages would be replayed
1280
+ const dry = await kafka.replayDlq('orders', { dryRun: true });
1281
+
1282
+ // Route to a different topic
1283
+ const result = await kafka.replayDlq('orders', { targetTopic: 'orders.v2' });
1284
+
1285
+ // Only replay messages with a specific correlation ID
1286
+ const filtered = await kafka.replayDlq('orders', {
1287
+ filter: (headers) => headers['x-correlation-id'] === 'corr-123',
1288
+ });
1289
+ ```
1290
+
1291
+ `replayDlq` creates a temporary consumer group that reads the DLQ topic up to the high-watermark at the time of the call — messages published after replay starts are not included. DLQ metadata headers (`x-dlq-original-topic`, `x-dlq-error-message`, `x-dlq-error-stack`, `x-dlq-failed-at`, `x-dlq-attempt-count`) are stripped from the replayed messages; all other headers (e.g. `x-correlation-id`) are preserved.
1292
+
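The header handling during replay can be sketched as a simple filter. This is illustrative only; the DLQ header names are the ones listed above, and everything else passes through unchanged:

```typescript
const DLQ_HEADERS = new Set([
  'x-dlq-original-topic',
  'x-dlq-error-message',
  'x-dlq-error-stack',
  'x-dlq-failed-at',
  'x-dlq-attempt-count',
]);

// Remove DLQ metadata headers, keep all application headers.
function stripDlqHeaders(headers: Record<string, string>): Record<string, string> {
  return Object.fromEntries(
    Object.entries(headers).filter(([key]) => !DLQ_HEADERS.has(key)),
  );
}
```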
1082
1293
  ## Graceful shutdown
1083
1294
 
1084
1295
  `disconnect()` now drains in-flight handlers before tearing down connections — no messages are silently cut off mid-processing.