@drarzter/kafka-client 0.9.4 → 0.11.0
- package/README.md +693 -8
- package/dist/chunk-OR7TPAAE.mjs +4760 -0
- package/dist/chunk-OR7TPAAE.mjs.map +1 -0
- package/dist/chunk-PQVBRDNV.mjs +149 -0
- package/dist/chunk-PQVBRDNV.mjs.map +1 -0
- package/dist/cli/dlq.d.ts +119 -0
- package/dist/cli/dlq.d.ts.map +1 -0
- package/dist/cli/index.d.ts +3 -0
- package/dist/cli/index.d.ts.map +1 -0
- package/dist/{chunk-SM4FZKAZ.mjs → cli/index.js} +1073 -309
- package/dist/cli/index.js.map +1 -0
- package/dist/cli/index.mjs +356 -0
- package/dist/cli/index.mjs.map +1 -0
- package/dist/client/config/from-env.d.ts +188 -0
- package/dist/client/config/from-env.d.ts.map +1 -0
- package/dist/client/config/index.d.ts +2 -0
- package/dist/client/config/index.d.ts.map +1 -0
- package/dist/client/errors.d.ts +67 -0
- package/dist/client/errors.d.ts.map +1 -0
- package/dist/client/kafka.client/admin/ops.d.ts +114 -0
- package/dist/client/kafka.client/admin/ops.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/features/delayed.d.ts +24 -0
- package/dist/client/kafka.client/consumer/features/delayed.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/features/dlq-replay.d.ts +52 -0
- package/dist/client/kafka.client/consumer/features/dlq-replay.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/features/routed.d.ts +4 -0
- package/dist/client/kafka.client/consumer/features/routed.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/features/snapshot.d.ts +10 -0
- package/dist/client/kafka.client/consumer/features/snapshot.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/features/window.d.ts +5 -0
- package/dist/client/kafka.client/consumer/features/window.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/handler.d.ts +163 -0
- package/dist/client/kafka.client/consumer/handler.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/ops.d.ts +64 -0
- package/dist/client/kafka.client/consumer/ops.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/pipeline.d.ts +168 -0
- package/dist/client/kafka.client/consumer/pipeline.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/queue.d.ts +37 -0
- package/dist/client/kafka.client/consumer/queue.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/retry-topic.d.ts +68 -0
- package/dist/client/kafka.client/consumer/retry-topic.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/setup.d.ts +66 -0
- package/dist/client/kafka.client/consumer/setup.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/start.d.ts +7 -0
- package/dist/client/kafka.client/consumer/start.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/stop.d.ts +19 -0
- package/dist/client/kafka.client/consumer/stop.d.ts.map +1 -0
- package/dist/client/kafka.client/consumer/subscribe-retry.d.ts +4 -0
- package/dist/client/kafka.client/consumer/subscribe-retry.d.ts.map +1 -0
- package/dist/client/kafka.client/context.d.ts +75 -0
- package/dist/client/kafka.client/context.d.ts.map +1 -0
- package/dist/client/kafka.client/index.d.ts +155 -0
- package/dist/client/kafka.client/index.d.ts.map +1 -0
- package/dist/client/kafka.client/infra/circuit-breaker.manager.d.ts +61 -0
- package/dist/client/kafka.client/infra/circuit-breaker.manager.d.ts.map +1 -0
- package/dist/client/kafka.client/infra/dedup.store.d.ts +28 -0
- package/dist/client/kafka.client/infra/dedup.store.d.ts.map +1 -0
- package/dist/client/kafka.client/infra/inflight.tracker.d.ts +22 -0
- package/dist/client/kafka.client/infra/inflight.tracker.d.ts.map +1 -0
- package/dist/client/kafka.client/infra/metrics.manager.d.ts +67 -0
- package/dist/client/kafka.client/infra/metrics.manager.d.ts.map +1 -0
- package/dist/client/kafka.client/producer/lifecycle.d.ts +41 -0
- package/dist/client/kafka.client/producer/lifecycle.d.ts.map +1 -0
- package/dist/client/kafka.client/producer/ops.d.ts +79 -0
- package/dist/client/kafka.client/producer/ops.d.ts.map +1 -0
- package/dist/client/kafka.client/producer/send.d.ts +21 -0
- package/dist/client/kafka.client/producer/send.d.ts.map +1 -0
- package/dist/client/kafka.client/validate-options.d.ts +11 -0
- package/dist/client/kafka.client/validate-options.d.ts.map +1 -0
- package/dist/client/message/envelope.d.ts +105 -0
- package/dist/client/message/envelope.d.ts.map +1 -0
- package/dist/client/message/schema-registry.d.ts +124 -0
- package/dist/client/message/schema-registry.d.ts.map +1 -0
- package/dist/client/message/serde.d.ts +68 -0
- package/dist/client/message/serde.d.ts.map +1 -0
- package/dist/client/message/topic.d.ts +159 -0
- package/dist/client/message/topic.d.ts.map +1 -0
- package/dist/client/message/versioned-schema.d.ts +53 -0
- package/dist/client/message/versioned-schema.d.ts.map +1 -0
- package/dist/client/outbox/index.d.ts +4 -0
- package/dist/client/outbox/index.d.ts.map +1 -0
- package/dist/client/outbox/outbox.relay.d.ts +90 -0
- package/dist/client/outbox/outbox.relay.d.ts.map +1 -0
- package/dist/client/outbox/outbox.store.d.ts +42 -0
- package/dist/client/outbox/outbox.store.d.ts.map +1 -0
- package/dist/client/outbox/outbox.types.d.ts +144 -0
- package/dist/client/outbox/outbox.types.d.ts.map +1 -0
- package/dist/client/security/acl.d.ts +108 -0
- package/dist/client/security/acl.d.ts.map +1 -0
- package/dist/client/security/index.d.ts +5 -0
- package/dist/client/security/index.d.ts.map +1 -0
- package/dist/client/security/providers.d.ts +88 -0
- package/dist/client/security/providers.d.ts.map +1 -0
- package/dist/client/security/resolve-security.d.ts +19 -0
- package/dist/client/security/resolve-security.d.ts.map +1 -0
- package/dist/client/security/security.types.d.ts +76 -0
- package/dist/client/security/security.types.d.ts.map +1 -0
- package/dist/client/transport/confluent.transport.d.ts +32 -0
- package/dist/client/transport/confluent.transport.d.ts.map +1 -0
- package/dist/client/transport/transport.interface.d.ts +221 -0
- package/dist/client/transport/transport.interface.d.ts.map +1 -0
- package/dist/client/types/admin.interface.d.ts +174 -0
- package/dist/client/types/admin.interface.d.ts.map +1 -0
- package/dist/client/types/admin.types.d.ts +140 -0
- package/dist/client/types/admin.types.d.ts.map +1 -0
- package/dist/client/types/client.d.ts +21 -0
- package/dist/client/types/client.d.ts.map +1 -0
- package/dist/client/types/common.d.ts +84 -0
- package/dist/client/types/common.d.ts.map +1 -0
- package/dist/client/types/config.types.d.ts +167 -0
- package/dist/client/types/config.types.d.ts.map +1 -0
- package/dist/client/types/consumer.interface.d.ts +115 -0
- package/dist/client/types/consumer.interface.d.ts.map +1 -0
- package/dist/{consumer.types-fFCag3VJ.d.mts → client/types/consumer.types.d.ts} +62 -383
- package/dist/client/types/consumer.types.d.ts.map +1 -0
- package/dist/client/types/dedup.types.d.ts +50 -0
- package/dist/client/types/dedup.types.d.ts.map +1 -0
- package/dist/client/types/lifecycle.interface.d.ts +72 -0
- package/dist/client/types/lifecycle.interface.d.ts.map +1 -0
- package/dist/client/types/producer.interface.d.ts +52 -0
- package/dist/client/types/producer.interface.d.ts.map +1 -0
- package/dist/client/types/producer.types.d.ts +90 -0
- package/dist/client/types/producer.types.d.ts.map +1 -0
- package/dist/client/types.d.ts +8 -0
- package/dist/client/types.d.ts.map +1 -0
- package/dist/core.d.ts +13 -314
- package/dist/core.d.ts.map +1 -0
- package/dist/core.js +1466 -123
- package/dist/core.js.map +1 -1
- package/dist/core.mjs +45 -3
- package/dist/index.d.ts +7 -128
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +1483 -123
- package/dist/index.js.map +1 -1
- package/dist/index.mjs +62 -3
- package/dist/index.mjs.map +1 -1
- package/dist/nest/kafka.constants.d.ts +5 -0
- package/dist/nest/kafka.constants.d.ts.map +1 -0
- package/dist/nest/kafka.decorator.d.ts +49 -0
- package/dist/nest/kafka.decorator.d.ts.map +1 -0
- package/dist/nest/kafka.explorer.d.ts +17 -0
- package/dist/nest/kafka.explorer.d.ts.map +1 -0
- package/dist/nest/kafka.health.d.ts +7 -0
- package/dist/nest/kafka.health.d.ts.map +1 -0
- package/dist/nest/kafka.module.d.ts +61 -0
- package/dist/nest/kafka.module.d.ts.map +1 -0
- package/dist/otel.d.ts +83 -5
- package/dist/otel.d.ts.map +1 -0
- package/dist/otel.js +100 -6
- package/dist/otel.js.map +1 -1
- package/dist/otel.mjs +98 -5
- package/dist/otel.mjs.map +1 -1
- package/dist/serde.d.ts +157 -0
- package/dist/serde.d.ts.map +1 -0
- package/dist/serde.js +308 -0
- package/dist/serde.js.map +1 -0
- package/dist/serde.mjs +158 -0
- package/dist/serde.mjs.map +1 -0
- package/dist/testing/client.mock.d.ts +47 -0
- package/dist/testing/client.mock.d.ts.map +1 -0
- package/dist/testing/index.d.ts +4 -0
- package/dist/testing/index.d.ts.map +1 -0
- package/dist/testing/test.container.d.ts +63 -0
- package/dist/testing/test.container.d.ts.map +1 -0
- package/dist/{testing.d.mts → testing/transport.fake.d.ts} +7 -111
- package/dist/testing/transport.fake.d.ts.map +1 -0
- package/dist/testing.d.ts +2 -318
- package/dist/testing.d.ts.map +1 -0
- package/dist/testing.js +26 -0
- package/dist/testing.js.map +1 -1
- package/dist/testing.mjs +26 -0
- package/dist/testing.mjs.map +1 -1
- package/package.json +40 -8
- package/dist/chunk-SM4FZKAZ.mjs.map +0 -1
- package/dist/client-1irhGEu0.d.mts +0 -751
- package/dist/client-BpFjkHhr.d.ts +0 -751
- package/dist/consumer.types-fFCag3VJ.d.ts +0 -958
- package/dist/core.d.mts +0 -314
- package/dist/index.d.mts +0 -128
- package/dist/otel.d.mts +0 -27
package/README.md
CHANGED
|
@@ -24,17 +24,25 @@ Type-safe Kafka client for Node.js. Framework-agnostic core with a first-class N
|
|
|
24
24
|
- [Iterator: consume()](#iterator-consume)
|
|
25
25
|
- [Multiple consumer groups](#multiple-consumer-groups)
|
|
26
26
|
- [Partition key](#partition-key)
|
|
27
|
+
- [Typed partition keys](#typed-partition-keys)
|
|
27
28
|
- [Message headers](#message-headers)
|
|
28
29
|
- [Batch sending](#batch-sending)
|
|
30
|
+
- [Delayed delivery](#delayed-delivery)
|
|
29
31
|
- [Batch consuming](#batch-consuming)
|
|
30
32
|
- [Tombstone messages](#tombstone-messages)
|
|
31
33
|
- [Compression](#compression)
|
|
32
34
|
- [Transactions](#transactions)
|
|
33
35
|
- [Consumer interceptors](#consumer-interceptors)
|
|
34
36
|
- [Instrumentation](#instrumentation)
|
|
37
|
+
- [OpenTelemetry metrics](#opentelemetry-metrics)
|
|
38
|
+
- [Transport security](#transport-security)
|
|
39
|
+
- [AWS MSK IAM & GCP authentication](#aws-msk-iam--gcp-authentication)
|
|
40
|
+
- [ACL requirements](#acl-requirements)
|
|
41
|
+
- [Environment configuration](#environment-configuration)
|
|
35
42
|
- [Options reference](#options-reference)
|
|
36
43
|
- [Error classes](#error-classes)
|
|
37
44
|
- [Deduplication (Lamport Clock)](#deduplication-lamport-clock)
|
|
45
|
+
- [Pluggable deduplication store](#pluggable-deduplication-store)
|
|
38
46
|
- [Retry topic chain](#retry-topic-chain)
|
|
39
47
|
- [stopConsumer](#stopconsumer)
|
|
40
48
|
- [Pause and resume](#pause-and-resume)
|
|
@@ -53,7 +61,11 @@ Type-safe Kafka client for Node.js. Framework-agnostic core with a first-class N
|
|
|
53
61
|
- [Header-based routing](#header-based-routing)
|
|
54
62
|
- [Lag-based producer throttling](#lag-based-producer-throttling)
|
|
55
63
|
- [Transactional consumer](#transactional-consumer)
|
|
64
|
+
- [Transactional outbox](#transactional-outbox)
|
|
65
|
+
- [Serialization: JSON, Avro, Protobuf](#serialization-json-avro-protobuf)
|
|
66
|
+
- [Schema Registry client](#schema-registry-client)
|
|
56
67
|
- [Admin API](#admin-api)
|
|
68
|
+
- [DLQ CLI](#dlq-cli)
|
|
57
69
|
- [Graceful shutdown](#graceful-shutdown)
|
|
58
70
|
- [Consumer handles](#consumer-handles)
|
|
59
71
|
- [onMessageLost](#onmessagelost)
|
|
@@ -61,8 +73,11 @@ Type-safe Kafka client for Node.js. Framework-agnostic core with a first-class N
|
|
|
61
73
|
- [onRebalance](#onrebalance)
|
|
62
74
|
- [Consumer lag](#consumer-lag)
|
|
63
75
|
- [Handler timeout warning](#handler-timeout-warning)
|
|
76
|
+
- [Static group membership](#static-group-membership)
|
|
64
77
|
- [Schema validation](#schema-validation)
|
|
78
|
+
- [Versioned schemas](#versioned-schemas)
|
|
65
79
|
- [Context-aware validators](#context-aware-validators-schemaparsecontext)
|
|
80
|
+
- [Constructor options validation](#constructor-options-validation)
|
|
66
81
|
- [Health check](#health-check)
|
|
67
82
|
- [Testing](#testing)
|
|
68
83
|
- [Project structure](#project-structure)
|
|
@@ -107,13 +122,28 @@ Safe by default. Configurable when you need it. Escape hatches for when you know
|
|
|
107
122
|
- **Declarative & imperative** — use `@SubscribeTo()` decorator or `startConsumer()` directly
|
|
108
123
|
- **Async iterator** — `consume<K>()` returns an `AsyncIterableIterator<EventEnvelope<T[K]>>` for `for await` consumption; breaking out of the loop stops the consumer automatically
|
|
109
124
|
- **Message TTL** — `messageTtlMs` drops or DLQs messages older than a configurable threshold, preventing stale events from poisoning downstream systems after a lag spike
|
|
110
|
-
- **Circuit breaker** — `circuitBreaker` option applies a sliding-window breaker per topic-partition; pauses delivery on repeated
|
|
125
|
+
- **Circuit breaker** — `circuitBreaker` option applies a sliding-window breaker per topic-partition; pauses delivery on repeated handler failures and resumes after a configurable recovery window
|
|
111
126
|
- **Seek to offset** — `seekToOffset(groupId, assignments)` seeks individual partitions to explicit offsets for fine-grained replay
|
|
112
127
|
- **Tombstone messages** — `sendTombstone(topic, key)` sends a null-value record to compact a key out of a log-compacted topic; all instrumentation hooks still fire
|
|
113
128
|
- **Regex topic subscription** — `startConsumer([/^orders\..+/], handler)` subscribes using a pattern; the broker routes matching topics to the consumer dynamically
|
|
114
129
|
- **Compression** — per-send `compression` option (`gzip`, `snappy`, `lz4`, `zstd`) in `SendOptions` and `BatchSendOptions`
|
|
115
130
|
- **Partition assignment strategy** — `partitionAssigner` in `ConsumerOptions` chooses between `cooperative-sticky` (default), `roundrobin`, and `range`
|
|
116
131
|
- **Admin API** — `listConsumerGroups()`, `describeTopics()`, `deleteRecords()` for group inspection, partition metadata, and message deletion
|
|
132
|
+
- **Typed partition keys** — `topic('orders').type<T>().key(m => m.orderId)` binds a partition-key extractor to a descriptor so related messages land on the same partition without passing `key` at every call site
|
|
133
|
+
- **Versioned schemas** — `versionedSchema({ 1: v1, 2: v2 }, { migrate })` dispatches validation on the `x-schema-version` header and upgrades old shapes to the latest
|
|
134
|
+
- **Constructor validation** — the `KafkaClient` constructor fails fast, throwing a single aggregated error that lists every invalid config value instead of surfacing a confusing driver error on first use
|
|
135
|
+
- **Pluggable deduplication store** — swap the in-memory Lamport-clock store for a `DedupStore` (e.g. Redis-backed) so deduplication survives restarts and rebalances; fail-open on store errors
|
|
136
|
+
- **Delayed delivery** — `sendMessage(..., { deliverAfterMs })` stages messages in `<topic>.delayed`; a `startDelayedRelay()` consumer forwards them transactionally once the deadline passes
|
|
137
|
+
- **OpenTelemetry metrics** — `otelMetricsInstrumentation()` records send/consume counters and a handler-duration histogram; `otelLagGauge()` reports per-partition consumer lag as an observable gauge
|
|
138
|
+
- **Transport security** — `security: { ssl, sasl }` with secure-by-default rules: SASL auto-enables TLS, plaintext to non-local brokers warns once (silenceable via `allowInsecure: true`); SASL mechanisms `plain`, `scram-sha-256`, `scram-sha-512`, `oauthbearer`
|
|
139
|
+
- **AWS MSK / GCP auth** — `awsMskIamProvider({ region })` and `gcpAccessTokenProvider()` supply OAUTHBEARER tokens from the standard AWS / Google credential chains (IRSA, task roles, ADC)
|
|
140
|
+
- **ACL requirements helper** — `describeRequiredAcls()` enumerates every derived topic, companion group, ephemeral group, and transactional id a service needs; render them as `kafka-acls.sh` commands or an MSK IAM policy
|
|
141
|
+
- **Environment configuration** — `kafkaClientConfigFromEnv()`, `consumerOptionsFromEnv()`, and `mergeConsumerOptions()` build config from env vars with `code > env > defaults` precedence
|
|
142
|
+
- **Transactional outbox** — `startOutboxRelay()` publishes rows from a DB outbox table to Kafka inside a transaction; at-least-once with stable `eventId` for downstream dedup
|
|
143
|
+
- **Pluggable serialization** — JSON by default; drop in `avroSerde()` / `protobufSerde()` (`@drarzter/kafka-client/serde`) for **Confluent wire-format** Avro/Protobuf and interop with Java/Go via a Schema Registry, client-wide or per-topic
|
|
144
|
+
- **Schema Registry client** — `SchemaRegistryClient` + `registrySchema()` keep locally-defined schemas in lockstep with a Confluent-compatible registry
|
|
145
|
+
- **Static group membership** — `groupInstanceId` (`group.instance.id`) skips rebalance on k8s rolling restarts within `session.timeout.ms`
|
|
146
|
+
- **DLQ CLI** — `kafka-client-dlq ls | peek | replay` for inspecting and re-publishing dead letter queues from the terminal
|
|
117
147
|
|
|
118
148
|
See the [Roadmap](./ROADMAP.md) for upcoming features and version history.
|
|
119
149
|
|
|
@@ -605,6 +635,36 @@ await this.kafka.sendMessage(
|
|
|
605
635
|
);
|
|
606
636
|
```
|
|
607
637
|
|
|
638
|
+
### Typed partition keys
|
|
639
|
+
|
|
640
|
+
Instead of passing `key` at every call site, bind a partition-key extractor to the topic descriptor with `.key()`. The extractor runs on every send through that descriptor, so messages with the same logical key always land on the same partition — you never forget to set it. Available on both `.type<T>()` and `.schema()` descriptors:
|
|
641
|
+
|
|
642
|
+
```typescript
|
|
643
|
+
import { topic } from '@drarzter/kafka-client';
|
|
644
|
+
|
|
645
|
+
const OrderCreated = topic('order.created')
|
|
646
|
+
.type<{ orderId: string; userId: string; amount: number }>()
|
|
647
|
+
.key((m) => m.orderId);
|
|
648
|
+
|
|
649
|
+
// Key is derived automatically from the payload — no `key` needed
|
|
650
|
+
await kafka.sendMessage(OrderCreated, { orderId: '123', userId: '456', amount: 100 });
|
|
651
|
+
// → produced with key '123'
|
|
652
|
+
|
|
653
|
+
// Works with schema descriptors too
|
|
654
|
+
const PaymentTaken = topic('payment.taken')
|
|
655
|
+
.schema(z.object({ paymentId: z.string(), orderId: z.string() }))
|
|
656
|
+
.key((m) => m.orderId);
|
|
657
|
+
```
|
|
658
|
+
|
|
659
|
+
The extractor runs on the **original (pre-validation) payload**. An explicit `key` in `SendOptions` — or a batch item's `key` — always wins over the descriptor's extractor:
|
|
660
|
+
|
|
661
|
+
```typescript
|
|
662
|
+
// Explicit key overrides the extractor
|
|
663
|
+
await kafka.sendMessage(OrderCreated, { orderId: '123', userId: '456', amount: 100 }, {
|
|
664
|
+
key: 'custom-partition-key',
|
|
665
|
+
});
|
|
666
|
+
```
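The precedence rule can be pictured with a small stand-alone sketch. `resolvePartitionKey` is a hypothetical helper written for illustration, not a function exported by the library. It only mirrors the behaviour described above: an explicit key first, then the descriptor's extractor, otherwise no key.

```typescript
// Hypothetical helper, NOT part of @drarzter/kafka-client.
// It illustrates the documented precedence: explicit key > extractor > none.
type KeyExtractor<T> = (message: T) => string;

function resolvePartitionKey<T>(
  payload: T,
  extractor?: KeyExtractor<T>,
  explicitKey?: string,
): string | undefined {
  // An explicit key (SendOptions or batch item) always wins.
  if (explicitKey !== undefined) return explicitKey;
  // Otherwise derive the key from the original payload, if an extractor is bound.
  return extractor ? extractor(payload) : undefined;
}

interface Order {
  orderId: string;
  userId: string;
  amount: number;
}

const order: Order = { orderId: '123', userId: '456', amount: 100 };
const byOrderId: KeyExtractor<Order> = (m) => m.orderId;

const derivedKey = resolvePartitionKey(order, byOrderId);
const overriddenKey = resolvePartitionKey(order, byOrderId, 'custom-partition-key');
const noKey = resolvePartitionKey(order);
```

Here `derivedKey` comes from the payload, `overriddenKey` from the caller, and `noKey` stays `undefined` because neither source is present.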
|
|
667
|
+
|
|
608
668
|
## Message headers
|
|
609
669
|
|
|
610
670
|
Attach metadata to messages:
|
|
@@ -642,6 +702,33 @@ await this.kafka.sendBatch('order.created', [
|
|
|
642
702
|
]);
|
|
643
703
|
```
|
|
644
704
|
|
|
705
|
+
## Delayed delivery
|
|
706
|
+
|
|
707
|
+
Schedule a message for future delivery with `deliverAfterMs`. Instead of going straight to the target topic, the message is produced to a `<topic>.delayed` staging topic carrying `x-delayed-until` (deadline) and `x-delayed-target` headers. A **relay consumer** started via `startDelayedRelay()` holds each message until its deadline passes, then forwards it to the target topic:
|
|
708
|
+
|
|
709
|
+
```typescript
|
|
710
|
+
// 1. Start the relay once (per process) for the topics you delay-deliver to
|
|
711
|
+
await kafka.startDelayedRelay(['order.reminder']);
|
|
712
|
+
|
|
713
|
+
// 2. Send a message that should arrive in ~1 hour
|
|
714
|
+
await kafka.sendMessage(
|
|
715
|
+
'order.reminder',
|
|
716
|
+
{ orderId: '123', channel: 'email' },
|
|
717
|
+
{ deliverAfterMs: 60 * 60 * 1000 },
|
|
718
|
+
);
|
|
719
|
+
// → staged in order.reminder.delayed, forwarded to order.reminder ~1 h later
|
|
720
|
+
```
|
|
721
|
+
|
|
722
|
+
`deliverAfterMs` also works on `sendBatch` — it applies to the whole batch:
|
|
723
|
+
|
|
724
|
+
```typescript
|
|
725
|
+
await kafka.sendBatch('order.reminder', messages, { deliverAfterMs: 30_000 });
|
|
726
|
+
```
|
|
727
|
+
|
|
728
|
+
The relay defaults to a `<defaultGroupId>-delayed-relay` consumer group; override it with `startDelayedRelay(topics, { groupId })`. Forwarding is **transactional** — the produce to the target topic and the source-offset commit happen atomically, so no duplicates are relayed even if the relay crashes mid-forward. The original key, value, and envelope headers (`x-event-id`, `x-correlation-id`, `x-lamport-clock`, `traceparent`) all survive the hop; only the `x-delayed-*` control headers are stripped.
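The relay's per-message decision can be sketched roughly as follows. `planRelay` is a hypothetical function, not the library's implementation, and it assumes that `x-delayed-until` carries an epoch-millisecond deadline encoded as a string; the real header encoding is not specified here.

```typescript
// Hypothetical sketch, NOT the library's relay code.
// Assumption: x-delayed-until is epoch milliseconds encoded as a string.
type Headers = Record<string, string>;

interface RelayDecision {
  waitMs: number; // 0 means the message can be forwarded now
  target: string | undefined; // value of x-delayed-target
  forwardHeaders: Headers; // headers that survive the hop
}

function planRelay(headers: Headers, nowMs: number): RelayDecision {
  const deadline = Number(headers['x-delayed-until']);
  // A missing or unparsable deadline is treated as already due (an assumption).
  const waitMs = Number.isFinite(deadline) ? Math.max(0, deadline - nowMs) : 0;

  // Strip only the x-delayed-* control headers; everything else is kept.
  const forwardHeaders: Headers = {};
  for (const [name, value] of Object.entries(headers)) {
    if (!name.startsWith('x-delayed-')) forwardHeaders[name] = value;
  }

  return { waitMs, target: headers['x-delayed-target'], forwardHeaders };
}

const pending = planRelay(
  {
    'x-delayed-until': '2000',
    'x-delayed-target': 'order.reminder',
    'x-event-id': 'evt-1',
  },
  1500,
);
const due = planRelay({ 'x-delayed-until': '1000', 'x-delayed-target': 'order.reminder' }, 1500);
```

A non-zero `waitMs` corresponds to the relay pausing the partition; once it reaches zero the message is forwarded to `target` with `forwardHeaders`.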
|
|
729
|
+
|
|
730
|
+
> **Delivery time is a lower bound.** The relay pauses a partition until the head-of-line message's deadline, so later messages on the same partition wait behind it and may arrive after their own deadline, but never before it. Delayed messages are only delivered while the relay is running, so treat it as a long-lived consumer, not a fire-and-forget scheduler.
|
|
731
|
+
|
|
645
732
|
## Batch consuming
|
|
646
733
|
|
|
647
734
|
Process messages in batches for higher throughput. The handler receives an array of `EventEnvelope`s and a `BatchMeta` object with offset management controls:
|
|
@@ -821,6 +908,47 @@ const kafka = new KafkaClient('my-app', 'my-group', brokers, {
|
|
|
821
908
|
|
|
822
909
|
`otelInstrumentation()` injects `traceparent` on send, extracts it on consume, and creates `CONSUMER` spans automatically. The span is set as the **active OTel context** for the handler's duration via `context.with()` — so `trace.getActiveSpan()` works inside your handler and any child spans are automatically parented to the consume span. Requires `@opentelemetry/api` as a peer dependency.
|
|
823
910
|
|
|
911
|
+
### OpenTelemetry metrics
|
|
912
|
+
|
|
913
|
+
`otelInstrumentation()` handles **traces**. For **metrics**, the same entrypoint exports `otelMetricsInstrumentation()` (counters + a duration histogram) and `otelLagGauge()` (an observable consumer-lag gauge). They share nothing with the tracing instrumentation and compose with it in any order:
|
|
914
|
+
|
|
915
|
+
```typescript
|
|
916
|
+
import {
|
|
917
|
+
otelInstrumentation,
|
|
918
|
+
otelMetricsInstrumentation,
|
|
919
|
+
otelLagGauge,
|
|
920
|
+
} from '@drarzter/kafka-client/otel';
|
|
921
|
+
|
|
922
|
+
const kafka = new KafkaClient('my-app', 'my-group', brokers, {
|
|
923
|
+
instrumentation: [otelInstrumentation(), otelMetricsInstrumentation()],
|
|
924
|
+
});
|
|
925
|
+
```
|
|
926
|
+
|
|
927
|
+
`otelMetricsInstrumentation()` registers seven instruments under the meter `@drarzter/kafka-client` (created once per instance, not per message):
|
|
928
|
+
|
|
929
|
+
| Instrument | Type | Attributes | Recorded when |
|
|
930
|
+
| ---------- | ---- | ---------- | ------------- |
|
|
931
|
+
| `kafka.client.messages.sent` | Counter | `topic` | a message is sent |
|
|
932
|
+
| `kafka.client.messages.processed` | Counter | `topic` | a handler succeeds |
|
|
933
|
+
| `kafka.client.messages.retried` | Counter | `topic` | a message is queued for retry |
|
|
934
|
+
| `kafka.client.messages.dlq` | Counter | `topic`, `reason` | a message is routed to a DLQ |
|
|
935
|
+
| `kafka.client.messages.duplicate` | Counter | `topic`, `strategy` | a Lamport-clock duplicate is detected |
|
|
936
|
+
| `kafka.client.consume.errors` | Counter | `topic` | a handler throws |
|
|
937
|
+
| `kafka.client.consume.duration` | Histogram (ms) | `topic` | measured across the handler's execution |
|
|
938
|
+
|
|
939
|
+
Pass a custom meter with `otelMetricsInstrumentation({ meter })` to route instruments through your own `MeterProvider`; it defaults to `metrics.getMeter('@drarzter/kafka-client')`.
|
|
940
|
+
|
|
941
|
+
`otelLagGauge()` registers an observable gauge `kafka.client.consumer.lag` (attributes `topic`, `partition`, `groupId`) that polls `getConsumerLag()` on each metric-collection cycle. It returns an **unregister disposer** — call it on shutdown to stop observing:
|
|
942
|
+
|
|
943
|
+
```typescript
|
|
944
|
+
const unregisterLag = otelLagGauge(kafka, { groupId: 'billing-service' });
|
|
945
|
+
|
|
946
|
+
// ...later, on shutdown:
|
|
947
|
+
unregisterLag();
|
|
948
|
+
```
|
|
949
|
+
|
|
950
|
+
When `groupId` is omitted, the gauge queries the client's constructor group and reports the `groupId` attribute as an empty string; `meter` overrides the meter as above. Lag-query failures during a collection cycle are swallowed silently, so a broker hiccup yields no samples for that cycle rather than breaking metric collection. Both helpers require `@opentelemetry/api` as a peer dependency.
|
|
951
|
+
|
|
824
952
|
### Custom instrumentation
|
|
825
953
|
|
|
826
954
|
`beforeConsume` can return a `BeforeConsumeResult` — either the legacy `() => void` cleanup function, or an object with `cleanup` and/or `wrap`:
|
|
@@ -917,6 +1045,204 @@ Passing a topic name that has not seen any events returns a zero-valued snapshot
|
|
|
917
1045
|
|
|
918
1046
|
Counters are incremented in the same code paths that fire the corresponding hooks — they are always active regardless of whether any instrumentation is configured.
|
|
919
1047
|
|
|
1048
|
+
## Transport security
|
|
1049
|
+
|
|
1050
|
+
Configure TLS and SASL through the `security` option on `KafkaClientOptions`. The library applies **secure-by-default** rules so credentials never leak onto plaintext connections by accident:
|
|
1051
|
+
|
|
1052
|
+
- **SASL auto-enables TLS.** When `sasl` is set and `ssl` is left unset, `ssl` is turned on automatically — SASL credentials always travel over TLS unless you explicitly opt out.
|
|
1053
|
+
- **Explicit `ssl: false` with SASL warns.** Setting `sasl` together with `ssl: false` logs a warning that credentials will cross the wire in plaintext — only safe on fully trusted networks.
|
|
1054
|
+
- **Plaintext to non-local brokers warns once.** With no `ssl`/`sasl` at all and at least one non-local broker (anything outside `localhost`, `127.0.0.0/8`, `::1`, `0.0.0.0`, `host.docker.internal`), a single warning is logged per client. Acknowledge and silence it with `allowInsecure: true`.
|
|
1055
|
+
|
|
1056
|
+
Nothing here ever throws or blocks a connection — the defaults protect, you stay in control.
|
|
1057
|
+
|
|
1058
|
+
```typescript
|
|
1059
|
+
import { KafkaClient } from '@drarzter/kafka-client/core';
|
|
1060
|
+
|
|
1061
|
+
// SASL/SCRAM over TLS — ssl auto-enabled because sasl is set
|
|
1062
|
+
const kafka = new KafkaClient('billing-svc', 'billing-group', ['broker.example.com:9093'], {
|
|
1063
|
+
security: {
|
|
1064
|
+
sasl: {
|
|
1065
|
+
mechanism: 'scram-sha-512',
|
|
1066
|
+
username: 'billing-svc',
|
|
1067
|
+
password: process.env.KAFKA_PASSWORD!,
|
|
1068
|
+
},
|
|
1069
|
+
// ssl: true — inferred automatically; set explicitly if you prefer
|
|
1070
|
+
},
|
|
1071
|
+
});
|
|
1072
|
+
```
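The secure-by-default rules listed above can be restated as a small decision function. This is a hypothetical re-implementation for illustration only; `resolveSecurity`, `isLocal`, and the warning texts are made-up names, and the library's own resolver may differ in detail.

```typescript
// Hypothetical sketch of the documented rules, NOT the library's resolver.
interface SecurityInput {
  ssl?: boolean;
  sasl?: { mechanism: string };
  allowInsecure?: boolean;
}

interface ResolvedSecurity {
  ssl: boolean;
  warnings: string[];
}

const LOCAL_HOSTS = new Set(['localhost', '::1', '0.0.0.0', 'host.docker.internal']);

function hostOf(broker: string): string {
  // Bracketed IPv6 with port, e.g. "[::1]:9092".
  if (broker.startsWith('[')) return broker.slice(1, broker.indexOf(']'));
  const first = broker.indexOf(':');
  const last = broker.lastIndexOf(':');
  // More than one colon without brackets: treat as a bare IPv6 address.
  if (first !== last) return broker;
  return last === -1 ? broker : broker.slice(0, last);
}

function isLocal(broker: string): boolean {
  const host = hostOf(broker);
  // 127.0.0.0/8 is matched by its first octet.
  return LOCAL_HOSTS.has(host) || /^127\.\d+\.\d+\.\d+$/.test(host);
}

function resolveSecurity(input: SecurityInput, brokers: string[]): ResolvedSecurity {
  const warnings: string[] = [];
  // Rule 1: SASL turns TLS on unless ssl was set explicitly.
  const ssl = input.ssl ?? (input.sasl !== undefined);
  // Rule 2: explicit ssl: false together with SASL only warns.
  if (input.sasl && input.ssl === false) {
    warnings.push('SASL credentials will cross the wire in plaintext');
  }
  // Rule 3: plaintext to a non-local broker warns unless acknowledged.
  if (!ssl && !input.sasl && !input.allowInsecure && brokers.some((b) => !isLocal(b))) {
    warnings.push('plaintext connection to a non-local broker');
  }
  return { ssl, warnings };
}

const saslOnly = resolveSecurity({ sasl: { mechanism: 'scram-sha-512' } }, ['broker.example.com:9093']);
const saslNoTls = resolveSecurity({ sasl: { mechanism: 'plain' }, ssl: false }, ['broker.example.com:9092']);
const localDev = resolveSecurity({}, ['localhost:9092', '127.0.0.1:9092', '[::1]:9092']);
const remotePlain = resolveSecurity({}, ['broker.example.com:9092']);
const acknowledged = resolveSecurity({ allowInsecure: true }, ['broker.example.com:9092']);
```

Note that the sketch never throws: like the rules it mirrors, it only collects warnings and leaves the final choice to the caller.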
|
|
1073
|
+
|
|
1074
|
+
`KafkaSecurityOptions`:
|
|
1075
|
+
|
|
1076
|
+
| Field | Default | Description |
|
|
1077
|
+
| ----- | ------- | ----------- |
|
|
1078
|
+
| `ssl` | `true` when `sasl` set, else `false` | Enable TLS |
|
|
1079
|
+
| `sasl` | — | SASL authentication (see below) |
|
|
1080
|
+
| `allowInsecure` | `false` | Acknowledge an intentionally insecure (plaintext, non-local) setup and silence the warning. No effect when `ssl`/`sasl` are set |
|
|
1081
|
+
|
|
1082
|
+
`sasl` is a discriminated union on `mechanism`:
|
|
1083
|
+
|
|
1084
|
+
```typescript
|
|
1085
|
+
// Username / password mechanisms
|
|
1086
|
+
{ mechanism: 'plain' | 'scram-sha-256' | 'scram-sha-512', username: string, password: string }
|
|
1087
|
+
|
|
1088
|
+
// Token-based (AWS MSK IAM, GCP, custom)
|
|
1089
|
+
{ mechanism: 'oauthbearer', oauthBearerProvider: () => Promise<OAuthBearerToken> }
|
|
1090
|
+
```
|
|
1091
|
+
|
|
1092
|
+
An `OAuthBearerProvider` is an async factory the driver calls on connect and before each token expiry; it returns `{ value, principal?, lifetimeMs?, extensions? }`.
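A custom provider might look like the sketch below. The two type aliases are declared locally to mirror the shape described above (in real code you would use the library's own types), and `customTokenProvider` and `FetchToken` are hypothetical names. The optional `lifetimeMs` and `extensions` fields are left out because the text does not say whether `lifetimeMs` is a duration or an absolute timestamp.

```typescript
// Local stand-ins mirroring the documented token shape; not imported from the library.
interface OAuthBearerToken {
  value: string;
  principal?: string;
  lifetimeMs?: number;
  extensions?: Record<string, string>;
}
type OAuthBearerProvider = () => Promise<OAuthBearerToken>;

// Hypothetical token source, e.g. a call to your identity provider.
type FetchToken = () => Promise<string>;

function customTokenProvider(fetchToken: FetchToken, principal: string): OAuthBearerProvider {
  // The returned factory fetches a fresh token every time it is invoked,
  // so no token is cached between calls.
  return async () => {
    const value = await fetchToken();
    return { value, principal };
  };
}

// Usage with a stubbed token source.
const provider = customTokenProvider(async () => 'token-from-idp', 'billing-svc');
const tokenPromise = provider();
```

The resulting `provider` would be passed as `oauthBearerProvider` alongside `mechanism: 'oauthbearer'`.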
|
|
1093
|
+
|
|
1094
|
+
### AWS MSK IAM & GCP authentication
|
|
1095
|
+
|
|
1096
|
+
Two ready-made `oauthbearer` providers cover the common managed-Kafka cases. Both resolve credentials from the platform's standard chain — nothing to hard-code — and rely on an **optional** peer dependency you install alongside this library.
|
|
1097
|
+
|
|
1098
|
+
**AWS MSK IAM** — `awsMskIamProvider({ region })` delegates token signing to `aws-msk-iam-sasl-signer-js`. Credentials come from the standard AWS provider chain, so EKS IRSA, ECS task roles, and env credentials all work unchanged. Authorisation is then governed by IAM policies (`kafka-cluster:*` actions) — see [ACL requirements](#acl-requirements) to generate one:
|
|
1099
|
+
|
|
1100
|
+
```bash
|
|
1101
|
+
npm install aws-msk-iam-sasl-signer-js
|
|
1102
|
+
```
|
|
1103
|
+
|
|
1104
|
+
```typescript
|
|
1105
|
+
import { KafkaClient, awsMskIamProvider } from '@drarzter/kafka-client/core';
|
|
1106
|
+
|
|
1107
|
+
const kafka = new KafkaClient('orders-svc', 'orders-group', brokers, {
|
|
1108
|
+
security: {
|
|
1109
|
+
sasl: {
|
|
1110
|
+
mechanism: 'oauthbearer',
|
|
1111
|
+
oauthBearerProvider: awsMskIamProvider({ region: 'eu-west-1' }),
|
|
1112
|
+
},
|
|
1113
|
+
},
|
|
1114
|
+
});
|
|
1115
|
+
```
|
|
1116
|
+
|
|
1117
|
+
**GCP** — `gcpAccessTokenProvider()` delegates to `google-auth-library` using Application Default Credentials, so GKE Workload Identity, attached service accounts, and `GOOGLE_APPLICATION_CREDENTIALS` all work unchanged. It supplies a raw ADC access token; verify the exact token format your cluster expects against current Google documentation:
|
|
1118
|
+
|
|
1119
|
+
```bash
|
|
1120
|
+
npm install google-auth-library
|
|
1121
|
+
```
|
|
1122
|
+
|
|
1123
|
+
```typescript
|
|
1124
|
+
import { KafkaClient, gcpAccessTokenProvider } from '@drarzter/kafka-client/core';
|
|
1125
|
+
|
|
1126
|
+
const kafka = new KafkaClient('events-svc', 'events-group', brokers, {
|
|
1127
|
+
security: {
|
|
1128
|
+
sasl: {
|
|
1129
|
+
mechanism: 'oauthbearer',
|
|
1130
|
+
oauthBearerProvider: gcpAccessTokenProvider(),
|
|
1131
|
+
},
|
|
1132
|
+
},
|
|
1133
|
+
});
|
|
1134
|
+
```
|
|
1135
|
+
|
|
1136
|
+
| Provider | Options | Optional peer dep |
|
|
1137
|
+
| -------- | ------- | ----------------- |
|
|
1138
|
+
| `awsMskIamProvider` | `{ region }` | `aws-msk-iam-sasl-signer-js` |
|
|
1139
|
+
| `gcpAccessTokenProvider` | `{ scopes?, principal?, tokenTtlMs? }` (defaults: `cloud-platform` scope, principal `'gcp'`, 50 min TTL) | `google-auth-library` |
|
|
1140
|
+
|
|
1141
|
+
Neither package is a hard dependency; both are dynamically imported on first token fetch. If the package is missing, the provider throws an error with a clear install hint rather than failing at build time.
|
|
1142
|
+
|
|
1143
|
+
### ACL requirements
|
|
1144
|
+
|
|
1145
|
+
The features that make this library convenient — retry topics, DLQ, delayed delivery, deduplication routing, DLQ replay, snapshots, clock recovery — quietly create **extra topics and consumer groups** (`<topic>.retry.N`, `<topic>.dlq`, `<topic>.delayed`, `<topic>.duplicates`, `<groupId>-retry.N`, timestamped ephemeral groups, transactional ids). On a locked-down cluster every one of them needs an ACL, and the last place you want to discover a missing grant is production at 3 a.m.
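To see how quickly the resource list grows, the naming patterns above can be expanded by hand. The helpers below are hypothetical and cover only the patterns spelled out in this paragraph, assuming retry levels are numbered from 1 up to the retry count; they are not a substitute for `describeRequiredAcls()`.

```typescript
// Hypothetical helpers expanding the documented naming patterns.
// Assumption: retry levels are numbered 1..maxRetries.
interface DerivedTopicOptions {
  maxRetries?: number;
  dlq?: boolean;
  delayed?: boolean;
  duplicates?: boolean;
}

function derivedTopics(topic: string, opts: DerivedTopicOptions): string[] {
  const names: string[] = [];
  for (let n = 1; n <= (opts.maxRetries ?? 0); n++) {
    names.push(`${topic}.retry.${n}`);
  }
  if (opts.dlq) names.push(`${topic}.dlq`);
  if (opts.delayed) names.push(`${topic}.delayed`);
  if (opts.duplicates) names.push(`${topic}.duplicates`);
  return names;
}

function retryGroups(groupId: string, maxRetries: number): string[] {
  // One companion consumer group per retry level: <groupId>-retry.N
  return Array.from({ length: maxRetries }, (_, i) => `${groupId}-retry.${i + 1}`);
}

const topics = derivedTopics('orders.created', { maxRetries: 3, dlq: true });
const groups = retryGroups('billing-svc-group', 3);
```

A single consumed topic with three retries and a DLQ already yields four extra topics and three extra groups, each of which needs its own grant.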
|
|
1146
|
+
|
|
1147
|
+
`describeRequiredAcls()` enumerates the complete set from a declarative usage profile. Feed the result to `toKafkaAclCommands()` for `kafka-acls.sh` commands, or `toMskIamPolicy()` for an AWS MSK IAM policy document:
|
|
1148
|
+
|
|
1149
|
+
```typescript
|
|
1150
|
+
import {
|
|
1151
|
+
describeRequiredAcls,
|
|
1152
|
+
toKafkaAclCommands,
|
|
1153
|
+
toMskIamPolicy,
|
|
1154
|
+
} from '@drarzter/kafka-client/core';
|
|
1155
|
+
|
|
1156
|
+
const resources = describeRequiredAcls({
|
|
1157
|
+
clientId: 'billing-svc',
|
|
1158
|
+
groupIds: ['billing-svc-group'],
|
|
1159
|
+
produceTopics: ['invoices.created'],
|
|
1160
|
+
consumeTopics: ['orders.created'],
|
|
1161
|
+
features: {
|
|
1162
|
+
retryTopics: { maxRetries: 3 },
|
|
1163
|
+
dlq: true,
|
|
1164
|
+
dlqReplay: true,
|
|
1165
|
+
transactions: true,
|
|
1166
|
+
},
|
|
1167
|
+
});
|
|
1168
|
+
|
|
1169
|
+
// Render kafka-acls.sh commands for a principal
|
|
1170
|
+
for (const cmd of toKafkaAclCommands(resources, 'User:billing-svc', 'broker:9092')) {
|
|
1171
|
+
console.log(cmd);
|
|
1172
|
+
}
|
|
1173
|
+
// kafka-acls.sh --bootstrap-server broker:9092 --add --allow-principal 'User:billing-svc' \
|
|
1174
|
+
// --operation READ --operation DESCRIBE --topic 'orders.created' # startConsumer
|
|
1175
|
+
// kafka-acls.sh ... --topic 'orders.created.dlq' # dlq: true — failed messages routed to DLQ
|
|
1176
|
+
// kafka-acls.sh ... --topic 'orders.created.retry.1' ... --topic 'orders.created.retry.3'
|
|
1177
|
+
// kafka-acls.sh ... --group 'billing-svc-group-retry.' --resource-pattern-type prefixed
|
|
1178
|
+
// kafka-acls.sh ... --transactional-id 'billing-svc-group-' --resource-pattern-type prefixed
|
|
1179
|
+
// kafka-acls.sh ... --group 'orders.created.dlq-replay' --operation DELETE --resource-pattern-type prefixed
|
|
1180
|
+
// ...
|
|
1181
|
+
|
|
1182
|
+
// Or an MSK IAM policy document
|
|
1183
|
+
const policy = toMskIamPolicy(resources, {
|
|
1184
|
+
region: 'eu-west-1',
|
|
1185
|
+
accountId: '123456789012',
|
|
1186
|
+
clusterName: 'prod',
|
|
1187
|
+
clusterUuid: 'abcd-1234',
|
|
1188
|
+
});
|
|
1189
|
+
```
|
|
1190
|
+
|
|
1191
|
+
`describeRequiredAcls()` returns `AclResource[]`, each carrying `resourceType` (`topic` | `group` | `transactional-id` | `cluster`), `patternType` (`literal` | `prefixed`), `name`, `operations`, and a `reason` naming the feature that requires it. Ephemeral-group features (`dlqReplay`, `snapshots`, `clockRecovery`) request `DELETE` on a **prefixed** pattern, because those groups are timestamped and cleaned up after use.
|
|
1192
|
+
|
|
1193
|
+
| Feature flag | Adds |
|
|
1194
|
+
| ------------ | ---- |
|
|
1195
|
+
| `dlq` | `<topic>.dlq` WRITE per consumed topic |
|
|
1196
|
+
| `retryTopics: { maxRetries }` | `<topic>.retry.1…N` topics; `<groupId>-retry.` prefixed groups; `<groupId>-` prefixed transactional ids |
|
|
1197
|
+
| `delayedDelivery` | `<topic>.delayed` topics; `<groupId>-delayed-relay` group + `-tx` id |
|
|
1198
|
+
| `duplicatesTopic` | `<topic>.duplicates` (or a custom topic name) WRITE |
|
|
1199
|
+
| `dlqReplay` | `<topic>.dlq-replay` prefixed groups (READ, DESCRIBE, **DELETE**) + DLQ READ |
|
|
1200
|
+
| `snapshots` | `<clientId>-snapshot-` prefixed groups (READ, DESCRIBE, **DELETE**) |
|
|
1201
|
+
| `clockRecovery` | `<clientId>-clock-recovery-` prefixed groups (READ, DESCRIBE, **DELETE**) |
|
|
1202
|
+
| `transactions` | `<clientId>-tx` transactional id |
|
|
1203
|
+
| `autoCreateTopics` | cluster `CREATE` (avoid in production) |
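To make the naming rules in the table concrete, here is a rough sketch that derives the companion resources for just two flags, `dlq` and `retryTopics`. The function `companionResources` and the `CompanionResource` type are hypothetical and cover only a subset of what `describeRequiredAcls()` returns; operations are left out because the table does not list them for every row.

```typescript
// Hypothetical sketch: names follow the feature table above.
// This is not the library's code and handles only `dlq` and `retryTopics`.
type PatternType = 'literal' | 'prefixed';

interface CompanionResource {
  resourceType: 'topic' | 'group';
  patternType: PatternType;
  name: string;
  reason: string;
}

function companionResources(
  consumeTopics: string[],
  groupIds: string[],
  features: { dlq?: boolean; retryTopics?: { maxRetries: number } },
): CompanionResource[] {
  const out: CompanionResource[] = [];

  for (const topic of consumeTopics) {
    if (features.dlq) {
      out.push({ resourceType: 'topic', patternType: 'literal', name: `${topic}.dlq`, reason: 'dlq' });
    }
    // One literal retry topic per level: <topic>.retry.1 ... <topic>.retry.N
    const levels = features.retryTopics?.maxRetries ?? 0;
    for (let level = 1; level <= levels; level++) {
      out.push({
        resourceType: 'topic',
        patternType: 'literal',
        name: `${topic}.retry.${level}`,
        reason: 'retryTopics',
      });
    }
  }

  if (features.retryTopics) {
    // Retry consumer groups share a prefix, so a single prefixed pattern covers them.
    for (const groupId of groupIds) {
      out.push({
        resourceType: 'group',
        patternType: 'prefixed',
        name: `${groupId}-retry.`,
        reason: 'retryTopics',
      });
    }
  }
  return out;
}
```

The prefixed group pattern is what keeps the grant list short: one entry covers every retry level instead of one ACL per level.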
|
|
1204
|
+
|
|
1205
|
+
`toMskIamPolicy()` maps Kafka operations to `kafka-cluster:*` actions, turns prefixed patterns into `name*` ARN wildcards, and always includes `kafka-cluster:Connect`. **Review both outputs against your organisation's least-privilege standards and current AWS documentation before applying** — they are a starting point, not a rubber stamp.
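The "prefixed pattern becomes a `name*` wildcard" rule can be pictured with a small sketch. `toMskArn` is a hypothetical helper, and the ARN layout (`arn:aws:kafka:<region>:<account>:<type>/<clusterName>/<clusterUuid>/<name>`) is an assumption based on the MSK IAM access-control resource format; check it against current AWS documentation and the actual output of `toMskIamPolicy()` before relying on it.

```typescript
// Hypothetical sketch of ARN construction; verify the format against AWS docs.
interface ClusterRef {
  region: string;
  accountId: string;
  clusterName: string;
  clusterUuid: string;
}

type MskResourceType = 'topic' | 'group' | 'transactional-id';

function toMskArn(
  cluster: ClusterRef,
  resourceType: MskResourceType,
  name: string,
  patternType: 'literal' | 'prefixed',
): string {
  // A prefixed Kafka ACL pattern maps to a trailing wildcard in the ARN.
  const suffix = patternType === 'prefixed' ? `${name}*` : name;
  const { region, accountId, clusterName, clusterUuid } = cluster;
  return `arn:aws:kafka:${region}:${accountId}:${resourceType}/${clusterName}/${clusterUuid}/${suffix}`;
}
```

A wildcard ARN grants access to every resource sharing the prefix, which is one reason the generated policy deserves a least-privilege review.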
|
|
1206
|
+
|
|
1207
|
+
## Environment configuration
|
|
1208
|
+
|
|
1209
|
+
Build client and consumer configuration from environment variables with a strict precedence rule: **explicit code options > env vars > built-in library defaults**. The helpers only *feed* values in — anything you hard-code always wins, and any variable left unset keeps the library default.
|
|
1210
|
+
|
|
1211
|
+
The library never reads a `.env` file itself. Load one first with Node's built-in `node --env-file=.env` (Node 20.6+) or the `dotenv` package, then call the helpers:
|
|
1212
|
+
|
|
1213
|
+
```typescript
|
|
1214
|
+
import { KafkaClient, kafkaClientConfigFromEnv } from '@drarzter/kafka-client/core';
|
|
1215
|
+
|
|
1216
|
+
const { clientId, groupId, brokers, options } = kafkaClientConfigFromEnv();
|
|
1217
|
+
|
|
1218
|
+
const kafka = new KafkaClient(
|
|
1219
|
+
clientId ?? 'my-svc', // env value or your fallback
|
|
1220
|
+
groupId ?? 'my-grp',
|
|
1221
|
+
brokers ?? ['localhost:9092'],
|
|
1222
|
+
{
|
|
1223
|
+
...options, // only the keys whose env vars were present
|
|
1224
|
+
onMessageLost: alerting, // code-level value — always applied, not env-configurable
|
|
1225
|
+
},
|
|
1226
|
+
);
|
|
1227
|
+
```
|
|
1228
|
+
|
|
1229
|
+
`kafkaClientConfigFromEnv(env?, prefix?)` reads `KAFKA_`-prefixed variables (`CLIENT_ID`, `GROUP_ID`, `BROKERS`, `AUTO_CREATE_TOPICS`, `STRICT_SCHEMAS`, `NUM_PARTITIONS`, `TRANSACTIONAL_ID`, `CLOCK_RECOVERY_*`, `LAG_THROTTLE_*`, and the security vars `SSL`, `SASL_MECHANISM`, `SASL_USERNAME`, `SASL_PASSWORD`, `ALLOW_INSECURE`). It returns `{ clientId?, groupId?, brokers?, options }`, emitting only the keys whose variables were set. Malformed booleans/numbers/enums throw with the offending variable named. `oauthbearer` cannot come from env — token providers are functions, so configure them in code.
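The two behaviours described above (emit only the keys that were set, and fail loudly on malformed values) can be sketched like this. Everything here is illustrative: `sketchConfigFromEnv` and its readers are hypothetical, and the accepted literals (`true`/`false`, comma-separated brokers) are assumptions rather than the library's documented grammar.

```typescript
// Hypothetical sketch of strict env parsing; not the library's implementation.
type Env = Record<string, string | undefined>;

function readBool(env: Env, name: string): boolean | undefined {
  const raw = env[name];
  if (raw === undefined) return undefined; // unset: let the library default apply
  if (raw === 'true') return true;
  if (raw === 'false') return false;
  throw new Error(`${name}: expected "true" or "false", got "${raw}"`);
}

function readInt(env: Env, name: string): number | undefined {
  const raw = env[name];
  if (raw === undefined) return undefined;
  if (!/^\d+$/.test(raw)) {
    throw new Error(`${name}: expected a non-negative integer, got "${raw}"`);
  }
  return Number(raw);
}

function readList(env: Env, name: string): string[] | undefined {
  const raw = env[name];
  if (raw === undefined) return undefined;
  return raw
    .split(',')
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

interface SketchConfig {
  brokers?: string[];
  options: { autoCreateTopics?: boolean; numPartitions?: number };
}

function sketchConfigFromEnv(env: Env, prefix = 'KAFKA_'): SketchConfig {
  const config: SketchConfig = { options: {} };

  // Assign a key only when its variable was present, so spreading the
  // result never overwrites a default with `undefined`.
  const brokers = readList(env, `${prefix}BROKERS`);
  if (brokers !== undefined) config.brokers = brokers;

  const autoCreate = readBool(env, `${prefix}AUTO_CREATE_TOPICS`);
  if (autoCreate !== undefined) config.options.autoCreateTopics = autoCreate;

  const partitions = readInt(env, `${prefix}NUM_PARTITIONS`);
  if (partitions !== undefined) config.options.numPartitions = partitions;

  return config;
}
```

Naming the offending variable in the error message matters in practice: a bare "invalid boolean" is hard to trace when dozens of variables are in play.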
|
|
1230
|
+
|
|
1231
|
+
`consumerOptionsFromEnv(env?, prefix?)` reads `KAFKA_CONSUMER_`-prefixed variables into a `Partial<ConsumerOptions>` (retry, DLQ, deduplication, circuit breaker, TTL, `GROUP_INSTANCE_ID`, and more). Merge it under your code-level options with `mergeConsumerOptions()`, which applies the precedence rule — later layers win, and the nested objects (`retry`, `deduplication`, `circuitBreaker`, `subscribeRetry`) are deep-merged so a code layer can override a single field:
|
|
1232
|
+
|
|
1233
|
+
```typescript
|
|
1234
|
+
import { consumerOptionsFromEnv, mergeConsumerOptions } from '@drarzter/kafka-client/core';
|
|
1235
|
+
|
|
1236
|
+
const envDefaults = consumerOptionsFromEnv();
|
|
1237
|
+
await kafka.startConsumer(
|
|
1238
|
+
['orders'],
|
|
1239
|
+
handler,
|
|
1240
|
+
mergeConsumerOptions(envDefaults, { dlq: true }), // code layer wins on conflict
|
|
1241
|
+
);
|
|
1242
|
+
```
|
|
1243
|
+
|
|
1244
|
+
Both helpers accept an explicit `env` object (handy in tests) and a custom variable `prefix`. See [`docs/configuration.md`](./docs/configuration.md) for the full variable reference and [`.env.example`](./.env.example) for a ready-to-copy template.
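The layering rule behind `mergeConsumerOptions()` (later layers win, four nested objects are merged one level deep) might look roughly like the sketch below. `mergeLayers` is a hypothetical stand-in; skipping `undefined` values is an assumption, and the field names in the usage note are illustrative only.

```typescript
// Hypothetical sketch of layered option merging; not the library's code.
const NESTED_KEYS: readonly string[] = ['retry', 'deduplication', 'circuitBreaker', 'subscribeRetry'];

type Layer = Record<string, unknown>;

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

function mergeLayers(...layers: Layer[]): Layer {
  const out: Layer = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      // Assumption: an explicit `undefined` does not erase an earlier layer's value.
      if (value === undefined) continue;
      const prev = out[key];
      if (NESTED_KEYS.includes(key) && isPlainObject(prev) && isPlainObject(value)) {
        // Nested objects are merged field by field, later layer winning per field.
        out[key] = { ...prev, ...value };
      } else {
        // Everything else is replaced wholesale by the later layer.
        out[key] = value;
      }
    }
  }
  return out;
}
```

With this shape, `mergeLayers({ retry: { maxRetries: 3, backoffMs: 1000 } }, { retry: { maxRetries: 5 } })` keeps `backoffMs` from the first layer while taking `maxRetries` from the second.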
|
|
1245
|
+
|
|
920
1246
|
## Options reference
|
|
921
1247
|
|
|
922
1248
|
### Send options
|
|
@@ -931,8 +1257,9 @@ Options for `sendMessage()` — the third argument:
|
|
|
931
1257
|
| `schemaVersion` | `1` | Schema version for the payload |
|
|
932
1258
|
| `eventId` | auto | Override the auto-generated event ID (UUID v4) |
|
|
933
1259
|
| `compression` | — | Compression codec for the message set: `'gzip'`, `'snappy'`, `'lz4'`, `'zstd'`; omit to send uncompressed |
|
|
1260
|
+
| `deliverAfterMs` | — | Delay delivery by at least this many milliseconds via a `<topic>.delayed` staging topic; requires a running `startDelayedRelay()` (see [Delayed delivery](#delayed-delivery)) |
|
|
934
1261
|
|
|
935
|
-
`sendBatch()` accepts `compression` as
|
|
1262
|
+
`sendBatch()` accepts `compression` and `deliverAfterMs` as top-level options (not per-message); all other options are per-message inside the array items.
|
|
936
1263
|
|
|
937
1264
|
### Consumer options
|
|
938
1265
|
|
|
@@ -951,15 +1278,17 @@ Options for `sendMessage()` — the third argument:
|
|
|
951
1278
|
| `handlerTimeoutMs` | — | Log a warning if the handler hasn't resolved within this window (ms) — does not cancel the handler |
|
|
952
1279
|
| `deduplication.strategy` | `'drop'` | What to do with duplicate messages: `'drop'` silently discards, `'dlq'` forwards to `{topic}.dlq` (requires `dlq: true`), `'topic'` forwards to `{topic}.duplicates` |
|
|
953
1280
|
| `deduplication.duplicatesTopic` | `{topic}.duplicates` | Custom destination for `strategy: 'topic'` |
|
|
1281
|
+
| `deduplication.store` | in-memory | Pluggable `DedupStore` for the per-partition last-processed clock; supply a persistent store (e.g. Redis) so dedup survives restarts/rebalances (see [Pluggable deduplication store](#pluggable-deduplication-store)) |
|
|
954
1282
|
| `messageTtlMs` | — | Drop (or DLQ) messages older than this many milliseconds at consumption time; evaluated against the `x-timestamp` header; see [Message TTL](#message-ttl) |
|
|
955
|
-
| `circuitBreaker` | — | Enable circuit breaker with `{}` for zero-config defaults;
|
|
956
|
-
| `circuitBreaker.threshold` | `5` |
|
|
1283
|
+
| `circuitBreaker` | — | Enable circuit breaker with `{}` for zero-config defaults; see [Circuit breaker](#circuit-breaker) |
|
|
1284
|
+
| `circuitBreaker.threshold` | `5` | Failed handler attempts within `windowSize` that open the circuit |
|
|
957
1285
|
| `circuitBreaker.recoveryMs` | `30_000` | Milliseconds to wait in OPEN state before entering HALF_OPEN |
|
|
958
1286
|
| `circuitBreaker.windowSize` | `threshold × 2, min 10` | Sliding window size in messages |
|
|
959
1287
|
| `circuitBreaker.halfOpenSuccesses` | `1` | Consecutive successes in HALF_OPEN required to close the circuit |
|
|
960
1288
|
| `queueHighWaterMark` | unbounded | Max messages buffered in the `consume()` iterator queue before the partition is paused; resumes at 50% drain. Only applies to `consume()` |
|
|
961
1289
|
| `batch` | `false` | (decorator only) Use `startBatchConsumer` instead of `startConsumer` |
|
|
962
1290
|
| `partitionAssigner` | `'cooperative-sticky'` | Partition assignment strategy: `'cooperative-sticky'` (minimal movement on rebalance, best for horizontal scaling), `'roundrobin'` (even distribution), `'range'` (contiguous partition ranges) |
|
|
1291
|
+
| `groupInstanceId` | — | Static group membership (`group.instance.id`) — a member that restarts within `session.timeout.ms` rejoins with the same partitions and no rebalance. Must be unique per member; not propagated to retry companions. See [Static group membership](#static-group-membership) |
|
|
963
1292
|
| `onTtlExpired` | — | Per-consumer override of the client-level `onTtlExpired` callback; takes precedence when set. Receives `TtlExpiredContext` — same shape as the client-level hook |
|
|
964
1293
|
| `onMessageLost` | — | Per-consumer override of the client-level `onMessageLost` callback; takes precedence when set. Use for consumer-specific dead-message alerting or structured logging |
|
|
965
1294
|
| `onRetry` | — | Per-consumer retry callback; fires **in addition to** the built-in metrics hook (does not replace it). Same signature as `KafkaInstrumentation.onRetry` |
|
|
@@ -980,11 +1309,34 @@ Passed to `KafkaModule.register()` or returned from `registerAsync()` factory:
|
|
|
980
1309
|
| `autoCreateTopics` | `false` | Auto-create topics on first send (dev only) |
|
|
981
1310
|
| `numPartitions` | `1` | Number of partitions for auto-created topics |
|
|
982
1311
|
| `strictSchemas` | `true` | Validate string topic keys against schemas registered via TopicDescriptor |
|
|
1312
|
+
| `security` | — | TLS + SASL transport security with secure-by-default rules (`{ ssl, sasl, allowInsecure }`); see [Transport security](#transport-security) |
|
|
983
1313
|
| `instrumentation` | `[]` | Client-wide instrumentation hooks (e.g. OTel). Applied to both send and consume paths |
|
|
984
1314
|
| `transactionalId` | `${clientId}-tx` | Transactional producer ID for `transaction()` calls. Must be unique per producer instance across the cluster — two instances sharing the same ID will be fenced by Kafka. The client logs a warning when the same ID is registered twice within one process |
|
|
985
1315
|
| `onMessageLost` | — | Called when a message is silently dropped without DLQ — use to alert, log to external systems, or trigger fallback logic |
|
|
986
1316
|
| `onTtlExpired` | — | Called when a message is dropped due to TTL expiration (`messageTtlMs`) and `dlq` is not enabled; receives `{ topic, ageMs, messageTtlMs, headers }` |
|
|
987
1317
|
| `onRebalance` | — | Called on every partition assign/revoke event across all consumers created by this client |
|
|
1318
|
+
| `clockRecovery.topics` | — | Topics to scan on `connectProducer()` to recover the highest `x-lamport-clock`, so the clock stays monotonic across restarts (see [Deduplication](#deduplication-lamport-clock)) |
|
|
1319
|
+
| `clockRecovery.timeoutMs` | `30000` | Max time (ms) to wait for clock recovery before proceeding with a partial result |
|
|
1320
|
+
| `lagThrottle` | — | Delay sends when a consumer group's lag exceeds `maxLag` (see [Lag-based producer throttling](#lag-based-producer-throttling)) |
|
|
1321
|
+
| `lagThrottle.maxLag` | — | Lag threshold (messages) above which sends are delayed (required when `lagThrottle` is set) |
|
|
1322
|
+
| `lagThrottle.groupId` | default group | Consumer group whose lag is monitored |
|
|
1323
|
+
| `lagThrottle.pollIntervalMs` | `5000` | How often (ms) to poll `getConsumerLag()` in the background |
|
|
1324
|
+
| `lagThrottle.maxWaitMs` | `30000` | Max time (ms) a send waits while throttled before proceeding anyway (best-effort, not hard back-pressure) |
|
|
1325
|
+
| `transport` | `ConfluentTransport` | Custom `KafkaTransport` implementation — target an alternative broker library or inject a deterministic fake in tests |
|
|
1326
|
+
|
|
1327
|
+
> **Advanced — direct transport access.** `ConfluentTransport` and the full
|
|
1328
|
+
> `KafkaTransport` interface family (`IProducer`, `IConsumer`, `IAdmin`, …) are
|
|
1329
|
+
> exported from `@drarzter/kafka-client/core`. When you need low-level admin
|
|
1330
|
+
> operations the facade does not expose (e.g. per-partition watermarks), build a
|
|
1331
|
+
> transport instead of deep-importing the raw driver:
|
|
1332
|
+
>
|
|
1333
|
+
> ```typescript
|
|
1334
|
+
> import { ConfluentTransport } from '@drarzter/kafka-client/core';
|
|
1335
|
+
>
|
|
1336
|
+
> const admin = new ConfluentTransport('ops-cli', brokers).admin();
|
|
1337
|
+
> await admin.connect();
|
|
1338
|
+
> const watermarks = await admin.fetchTopicOffsets('orders'); // [{ partition, low, high }]
|
|
1339
|
+
> ```
|
|
988
1340
|
|
|
989
1341
|
**Module-scoped** (default) — import `KafkaModule` in each module that needs it:
|
|
990
1342
|
|
|
@@ -1147,6 +1499,50 @@ Deduplication state is **in-memory and per-consumer-instance**. Understand what
|
|
|
1147
1499
|
|
|
1148
1500
|
Use this feature as a lightweight first line of defence — not as a substitute for idempotent business logic.
|
|
1149
1501
|
|
|
1502
|
+
### Pluggable deduplication store
|
|
1503
|
+
|
|
1504
|
+
The in-memory limitation above is only the **default**. Pass a `store` in `deduplication` to back the per-partition clock with any external system — Redis, a database, anything — so deduplication survives process restarts and rebalances. The store implements the `DedupStore` interface:
|
|
1505
|
+
|
|
1506
|
+
```typescript
|
|
1507
|
+
// Shape of the `DedupStore` interface exported from '@drarzter/kafka-client' (shown for reference):
|
|
1508
|
+
|
|
1509
|
+
interface DedupStore {
|
|
1510
|
+
// Return the last processed clock for a group + "topic:partition", or undefined.
|
|
1511
|
+
getLastClock(groupId: string, topicPartition: string): number | undefined | Promise<number | undefined>;
|
|
1512
|
+
// Persist the last processed clock for a group + "topic:partition".
|
|
1513
|
+
setLastClock(groupId: string, topicPartition: string, clock: number): void | Promise<void>;
|
|
1514
|
+
}
|
|
1515
|
+
```
|
|
1516
|
+
|
|
1517
|
+
Both methods may be synchronous or return a promise. A minimal Redis-backed store:
|
|
1518
|
+
|
|
1519
|
+
```typescript
|
|
1520
|
+
class RedisDedupStore implements DedupStore {
|
|
1521
|
+
constructor(private readonly redis: RedisClient) {}
|
|
1522
|
+
|
|
1523
|
+
private key(groupId: string, topicPartition: string) {
|
|
1524
|
+
return `dedup:${groupId}:${topicPartition}`;
|
|
1525
|
+
}
|
|
1526
|
+
|
|
1527
|
+
async getLastClock(groupId: string, topicPartition: string) {
|
|
1528
|
+
const raw = await this.redis.get(this.key(groupId, topicPartition));
|
|
1529
|
+
return raw === null ? undefined : Number(raw);
|
|
1530
|
+
}
|
|
1531
|
+
|
|
1532
|
+
async setLastClock(groupId: string, topicPartition: string, clock: number) {
|
|
1533
|
+
await this.redis.set(this.key(groupId, topicPartition), String(clock));
|
|
1534
|
+
}
|
|
1535
|
+
}
|
|
1536
|
+
|
|
1537
|
+
await kafka.startConsumer(['payments'], handler, {
|
|
1538
|
+
deduplication: { strategy: 'drop', store: new RedisDedupStore(redis) },
|
|
1539
|
+
});
|
|
1540
|
+
```
|
|
1541
|
+
|
|
1542
|
+
**Failure semantics (fail-open):** if `getLastClock` or `setLastClock` throws or rejects, the error is logged and the message is treated as **not** a duplicate. A transient store outage never silently drops messages — it only weakens deduplication until the store recovers, biasing towards at-least-once delivery.
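The fail-open rule can be expressed as a small wrapper around the store. This is a sketch under assumptions: `isDuplicate` is a hypothetical function, the "clock less than or equal to the last processed clock means duplicate" comparison is assumed, and the point at which the library actually persists the clock may differ.

```typescript
// Hypothetical sketch of a fail-open duplicate check; not the library's code.
interface DedupStoreLike {
  getLastClock(groupId: string, topicPartition: string): number | undefined | Promise<number | undefined>;
  setLastClock(groupId: string, topicPartition: string, clock: number): void | Promise<void>;
}

async function isDuplicate(
  store: DedupStoreLike,
  groupId: string,
  topicPartition: string,
  clock: number,
  log: (msg: string, err: unknown) => void = (msg, err) => console.warn(msg, err),
): Promise<boolean> {
  try {
    // `await` handles both synchronous and promise-returning stores.
    const last = await store.getLastClock(groupId, topicPartition);
    if (last !== undefined && clock <= last) return true; // assumed comparison
    await store.setLastClock(groupId, topicPartition, clock);
    return false;
  } catch (err) {
    // Fail open: a store outage must never cause a message to be dropped.
    log('dedup store unavailable; treating message as not a duplicate', err);
    return false;
  }
}
```

The trade-off is deliberate: during an outage some duplicates reach the handler, which is why idempotent business logic is still needed behind the dedup layer.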
|
|
1543
|
+
|
|
1544
|
+
When `store` is omitted, the built-in `InMemoryDedupStore` is used — the in-session behaviour described above.
|
|
1545
|
+
|
|
1150
1546
|
## Retry topic chain
|
|
1151
1547
|
|
|
1152
1548
|
> **tl;dr — recommended production setup:**
|
|
@@ -1246,9 +1642,9 @@ Pausing is non-destructive: the consumer stays connected and Kafka preserves the
|
|
|
1246
1642
|
|
|
1247
1643
|
## Circuit breaker
|
|
1248
1644
|
|
|
1249
|
-
Automatically pause delivery from a topic-partition when its
|
|
1645
|
+
Automatically pause delivery from a topic-partition when its handler failure rate exceeds a threshold. After a recovery window the partition is resumed automatically.
|
|
1250
1646
|
|
|
1251
|
-
|
|
1647
|
+
Failures are recorded at the handler-error boundary: every failed handler attempt counts (including in-process retries and retry-topic chain levels), independent of whether the message ends up in a DLQ. `dlq` is **not** required for the breaker to work.
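For intuition, the state machine behind these options can be sketched as below. `SketchCircuitBreaker` is a hypothetical class, not the library's implementation; the defaults mirror the options table, while details such as "a failure in HALF_OPEN re-opens the circuit" and "the window resets on state changes" are assumptions.

```typescript
// Hypothetical sliding-window breaker; illustrates CLOSED -> OPEN -> HALF_OPEN -> CLOSED.
type BreakerState = 'CLOSED' | 'OPEN' | 'HALF_OPEN';

interface BreakerOptions {
  threshold?: number;
  recoveryMs?: number;
  windowSize?: number;
  halfOpenSuccesses?: number;
}

class SketchCircuitBreaker {
  private state: BreakerState = 'CLOSED';
  private outcomes: boolean[] = []; // true = failed attempt
  private openedAt = 0;
  private halfOpenStreak = 0;
  private readonly threshold: number;
  private readonly recoveryMs: number;
  private readonly windowSize: number;
  private readonly halfOpenSuccesses: number;
  private readonly now: () => number;

  constructor(opts: BreakerOptions = {}, now: () => number = () => Date.now()) {
    this.threshold = opts.threshold ?? 5;
    this.recoveryMs = opts.recoveryMs ?? 30_000;
    this.windowSize = opts.windowSize ?? Math.max(this.threshold * 2, 10);
    this.halfOpenSuccesses = opts.halfOpenSuccesses ?? 1;
    this.now = now;
  }

  /** Current state, promoting OPEN to HALF_OPEN once the recovery window has elapsed. */
  current(): BreakerState {
    if (this.state === 'OPEN' && this.now() - this.openedAt >= this.recoveryMs) {
      this.state = 'HALF_OPEN';
      this.halfOpenStreak = 0;
    }
    return this.state;
  }

  /** Record one handler attempt and return the resulting state. */
  record(failed: boolean): BreakerState {
    const state = this.current();
    if (state === 'OPEN') return state; // partition is paused; nothing to count
    if (state === 'HALF_OPEN') {
      if (failed) return this.trip(); // assumption: any probe failure re-opens
      this.halfOpenStreak += 1;
      if (this.halfOpenStreak >= this.halfOpenSuccesses) {
        this.state = 'CLOSED';
        this.outcomes = [];
      }
      return this.state;
    }
    this.outcomes.push(failed);
    if (this.outcomes.length > this.windowSize) this.outcomes.shift();
    const failures = this.outcomes.filter(Boolean).length;
    return failures >= this.threshold ? this.trip() : this.state;
  }

  private trip(): BreakerState {
    this.state = 'OPEN';
    this.openedAt = this.now();
    this.outcomes = [];
    return this.state;
  }
}
```

The injectable `now` parameter exists only to make the sketch testable with a fake clock; a real consumer would also pause and resume the partition on the corresponding transitions.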
|
|
1252
1648
|
|
|
1253
1649
|
Zero-config start — all options have sensible defaults:
|
|
1254
1650
|
|
|
@@ -1287,7 +1683,7 @@ Options:
|
|
|
1287
1683
|
|
|
1288
1684
|
| Option | Default | Description |
|
|
1289
1685
|
| ------ | ------- | ----------- |
|
|
1290
|
-
| `threshold` | `5` |
|
|
1686
|
+
| `threshold` | `5` | Failed handler attempts within `windowSize` that open the circuit |
|
|
1291
1687
|
| `recoveryMs` | `30_000` | Milliseconds to wait in OPEN state before entering HALF_OPEN |
|
|
1292
1688
|
| `windowSize` | `threshold × 2, min 10` | Sliding window size in messages |
|
|
1293
1689
|
| `halfOpenSuccesses` | `1` | Consecutive successes in HALF_OPEN required to close the circuit |
|
|
@@ -1385,7 +1781,7 @@ await kafka.seekToTimestamp('payments-group', [
|
|
|
1385
1781
|
]);
|
|
1386
1782
|
```
|
|
1387
1783
|
|
|
1388
|
-
Uses `admin.
|
|
1784
|
+
Uses `admin.fetchTopicOffsetsByTimestamp` under the hood. If no offset exists at the requested timestamp (e.g. the partition is empty or the timestamp is in the future), the partition falls back to the current high watermark (end of topic — new messages only).
|
|
1389
1785
|
|
|
1390
1786
|
**Important:** the consumer group must be stopped before seeking. Assignments for the same topic are batched into a single `admin.setOffsets` call.
|
|
1391
1787
|
|
|
@@ -1691,6 +2087,169 @@ await kafka.startTransactionalConsumer(
|
|
|
1691
2087
|
|
|
1692
2088
|
`retryTopics: true` is rejected at startup — EOS redelivery on failure is already guaranteed by the transaction. `autoCommit` is always `false` (managed internally).
|
|
1693
2089
|
|
|
2090
|
+
## Transactional outbox
|
|
2091
|
+
|
|
2092
|
+
The transactional-outbox pattern decouples "write my business state" from "publish an event" so the two can never diverge. Application code writes an event row into an outbox table **in the same DB transaction** as its business writes; a relay polls that table and publishes the rows to Kafka, marking them published only after Kafka has acked them. If the process dies after the DB commit but before the publish, the row is still there and gets published on the next poll — the event is never lost.
|
|
2093
|
+
|
|
2094
|
+
`startOutboxRelay()` runs that relay against any `OutboxStore` you implement. The library never touches your database — you own the schema and the queries; it only needs to read unpublished rows oldest-first and durably mark rows published:
|
|
2095
|
+
|
|
2096
|
+
```typescript
|
|
2097
|
+
import { startOutboxRelay, type OutboxStore } from '@drarzter/kafka-client/core';
|
|
2098
|
+
|
|
2099
|
+
// Pseudo-Postgres store — you own the table and the SQL.
|
|
2100
|
+
const store: OutboxStore = {
|
|
2101
|
+
async fetchUnpublished(limit) {
|
|
2102
|
+
const { rows } = await pool.query(
|
|
2103
|
+
`SELECT id, topic, payload, key, correlation_id AS "correlationId",
|
|
2104
|
+
event_id AS "eventId", headers
|
|
2105
|
+
FROM outbox
|
|
2106
|
+
WHERE published_at IS NULL
|
|
2107
|
+
ORDER BY created_at ASC
|
|
2108
|
+
LIMIT $1`,
|
|
2109
|
+
[limit],
|
|
2110
|
+
);
|
|
2111
|
+
return rows;
|
|
2112
|
+
},
|
|
2113
|
+
async markPublished(ids) {
|
|
2114
|
+
await pool.query(`UPDATE outbox SET published_at = now() WHERE id = ANY($1)`, [ids]);
|
|
2115
|
+
},
|
|
2116
|
+
};
|
|
2117
|
+
|
|
2118
|
+
await kafka.connectProducer();
|
|
2119
|
+
|
|
2120
|
+
const relay = startOutboxRelay(kafka, store, {
|
|
2121
|
+
pollIntervalMs: 500, // default 1000
|
|
2122
|
+
batchSize: 200, // default 100 — rows fetched & published per tick
|
|
2123
|
+
onPublished: (n) => metrics.increment('outbox.published', n),
|
|
2124
|
+
onError: (err, batch) => logger.error(`outbox batch of ${batch.length} failed`, err),
|
|
2125
|
+
});
|
|
2126
|
+
|
|
2127
|
+
// On shutdown — stop() halts the timer and awaits any in-flight iteration:
|
|
2128
|
+
await relay.stop();
|
|
2129
|
+
await kafka.disconnect();
|
|
2130
|
+
```
|
|
2131
|
+
|
|
2132
|
+
Meanwhile, application code inserts outbox rows inside its business transaction:
|
|
2133
|
+
|
|
2134
|
+
```typescript
|
|
2135
|
+
// Inside a DB transaction, alongside your business INSERT/UPDATE.
// `randomUUID` comes from 'node:crypto'; `tx`, `order`, `corrId` and `eventId` are your own.
|
|
2136
|
+
await tx.query(
|
|
2137
|
+
`INSERT INTO outbox (id, topic, payload, key, correlation_id, event_id)
|
|
2138
|
+
VALUES ($1, $2, $3, $4, $5, $6)`,
|
|
2139
|
+
[randomUUID(), 'orders.created', JSON.stringify(order), order.id, corrId, eventId],
|
|
2140
|
+
);
|
|
2141
|
+
```
|
|
2142
|
+
|
|
2143
|
+
**Delivery guarantee: at-least-once.** Each poll publishes the whole batch inside **one Kafka transaction**, then marks the rows published. If the process crashes *after* the Kafka commit but *before* `markPublished`, those rows are re-published on the next tick — a **duplicate**. Persist a stable `eventId` on each row (surfaced as `x-event-id`) so consumers can deduplicate, either via this library's [Lamport-clock deduplication](#deduplication-lamport-clock) or an application-level idempotency check. Iterations never overlap; the loop never dies on error.
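The ordering that produces at-least-once delivery (publish first, mark second) and the consumer-side remedy can be sketched in a few lines. `relayTick`, `makeIdempotentHandler`, and the simplified row shape are hypothetical; the real relay additionally wraps the publish in a Kafka transaction and runs on a timer.

```typescript
// Hypothetical sketch of one relay iteration plus eventId-based idempotency.
interface OutboxRow {
  id: string;
  topic: string;
  payload: string;
  eventId: string;
}

interface SketchOutboxStore {
  fetchUnpublished(limit: number): Promise<OutboxRow[]>;
  markPublished(ids: string[]): Promise<void>;
}

async function relayTick(
  store: SketchOutboxStore,
  publishBatch: (rows: OutboxRow[]) => Promise<void>,
  batchSize = 100,
): Promise<number> {
  const rows = await store.fetchUnpublished(batchSize);
  if (rows.length === 0) return 0;

  // If this throws, nothing is marked and the rows are retried next tick.
  await publishBatch(rows);

  // If the process dies right here, the rows are already in Kafka but still
  // unmarked, so the next tick publishes them again (a duplicate).
  await store.markPublished(rows.map((r) => r.id));
  return rows.length;
}

// Consumer side: a stable eventId lets the handler ignore the duplicate.
function makeIdempotentHandler(handle: (row: OutboxRow) => void): (row: OutboxRow) => boolean {
  const seen = new Set<string>();
  return (row) => {
    if (seen.has(row.eventId)) return false; // already processed
    seen.add(row.eventId);
    handle(row);
    return true;
  };
}
```

The in-memory `Set` is only for illustration; a real consumer would keep processed ids in durable storage or rely on naturally idempotent writes.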
|
|
2144
|
+
|
|
2145
|
+
`OutboxStore` interface:
|
|
2146
|
+
|
|
2147
|
+
| Method | Description |
|
|
2148
|
+
| ------ | ----------- |
|
|
2149
|
+
| `fetchUnpublished(limit): Promise<OutboxMessage[]>` | Unpublished rows, oldest first, capped at `limit`. Empty array = nothing to do |
|
|
2150
|
+
| `markPublished(ids): Promise<void>` | Durably mark ids published; called only after Kafka acks. Idempotent |
|
|
2151
|
+
|
|
2152
|
+
An `InMemoryOutboxStore` (with `.add()`, `pendingCount`, `publishedCount`) ships for tests and as executable documentation — it is **not** durable, so it does not provide the "same DB transaction as the business write" guarantee that is the whole point of the pattern. A full Postgres reference implementation lives in [`src/integration/postgres-outbox.integration.spec.ts`](./src/integration/postgres-outbox.integration.spec.ts).
|
|
2153
|
+
|
|
2154
|
+
## Serialization: JSON, Avro, Protobuf
|
|
2155
|
+
|
|
2156
|
+
By default every message value is serialized as JSON — no configuration needed.
|
|
2157
|
+
Serialization is a pluggable seam (`MessageSerde`): swap in Avro or Protobuf
|
|
2158
|
+
with **Confluent wire format** (`[magic 0x00][4-byte schema id][payload]`) to
|
|
2159
|
+
interoperate with Java/Go producers and consumers through a Schema Registry.
|
|
2160
|
+
|
|
2161
|
+
```typescript
|
|
2162
|
+
import { KafkaClient, topic } from '@drarzter/kafka-client/core';
|
|
2163
|
+
import { avroSerde } from '@drarzter/kafka-client/serde';
|
|
2164
|
+
import { SchemaRegistryClient } from '@drarzter/kafka-client/core';
|
|
2165
|
+
|
|
2166
|
+
const registry = new SchemaRegistryClient({ baseUrl: 'http://localhost:8081' });
|
|
2167
|
+
|
|
2168
|
+
const orderSchema = {
|
|
2169
|
+
type: 'record', name: 'Order',
|
|
2170
|
+
fields: [{ name: 'orderId', type: 'string' }, { name: 'amount', type: 'int' }],
|
|
2171
|
+
};
|
|
2172
|
+
|
|
2173
|
+
// Client-wide: every value goes through Avro.
|
|
2174
|
+
const kafka = new KafkaClient('orders-svc', 'orders-grp', ['localhost:9092'], {
|
|
2175
|
+
serde: avroSerde({ registry, schema: orderSchema, autoRegister: true }),
|
|
2176
|
+
});
|
|
2177
|
+
|
|
2178
|
+
// …or per-topic (JSON elsewhere, Avro just here):
|
|
2179
|
+
const OrderCreated = topic('order.created')
|
|
2180
|
+
.serde(avroSerde({ registry, schema: orderSchema, autoRegister: true }))
|
|
2181
|
+
.type<{ orderId: string; amount: number }>();
|
|
2182
|
+
```
|
|
2183
|
+
|
|
2184
|
+
`protobufSerde({ registry, schema: protoSource, messageType: 'Order', autoRegister: true })`
|
|
2185
|
+
works the same way. `avsc` / `protobufjs` are **optional peer dependencies** —
|
|
2186
|
+
install only the one you use (`npm i avsc` or `npm i protobufjs`); a clear error
|
|
2187
|
+
tells you if it's missing.
|
|
2188
|
+
|
|
2189
|
+
**Serde options.** `registry` (required); `schema` (Avro JSON / `.proto` source —
|
|
2190
|
+
required to serialize); `subject?` (defaults to Confluent TopicNameStrategy
|
|
2191
|
+
`<topic>-value` / `<topic>-key`); `autoRegister?` (register the schema on first
|
|
2192
|
+
send to obtain its id — handy in dev; default `false` reads the latest registered
|
|
2193
|
+
schema instead). Parsed schemas and id→schema lookups are cached.
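The Confluent wire-format framing mentioned above (`[magic 0x00][4-byte schema id][payload]`) is simple enough to sketch directly. `frame` and `unframe` are hypothetical helpers, not exports of this library; the schema id is written big-endian, as in Confluent's format. For Protobuf, Confluent's format also places message-index bytes at the start of the payload, which this sketch does not handle.

```typescript
// Hypothetical framing helpers for the Confluent wire format.
const MAGIC_BYTE = 0x00;
const HEADER_LENGTH = 5; // 1 magic byte + 4-byte schema id

function frame(schemaId: number, payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(HEADER_LENGTH + payload.length);
  out[0] = MAGIC_BYTE;
  // Schema id is a 4-byte big-endian integer (littleEndian = false).
  new DataView(out.buffer).setUint32(1, schemaId, false);
  out.set(payload, HEADER_LENGTH);
  return out;
}

function unframe(data: Uint8Array): { schemaId: number; payload: Uint8Array } {
  if (data.length < HEADER_LENGTH || data[0] !== MAGIC_BYTE) {
    throw new Error('Value is not in Confluent wire format');
  }
  // Respect byteOffset so views into larger buffers decode correctly.
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  return {
    schemaId: view.getUint32(1, false),
    payload: data.subarray(HEADER_LENGTH),
  };
}
```

A deserializer reads the schema id first, looks up the writer schema in the registry (ideally from a cache), and only then decodes the remaining bytes.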
|
|
2194
|
+
|
|
2195
|
+
**Custom serde.** Implement `MessageSerde` (`serialize(value, ctx) → Buffer | string`,
|
|
2196
|
+
`deserialize(data, ctx) → value`) for MessagePack, CBOR, encryption, etc. `JsonSerde`
|
|
2197
|
+
is the default and is exported for composition.
|
|
2198
|
+
|
|
2199
|
+
**Notes & limits (v0.11):** the envelope headers (`x-event-id`, Lamport clock,
|
|
2200
|
+
`traceparent`, …) always travel as Kafka headers regardless of value serde. DLQ,
|
|
2201
|
+
retry-topic, duplicates, and delayed-relay forwarding preserve the original wire
|
|
2202
|
+
bytes losslessly, so binary formats survive every hop. Avro currently uses the
|
|
2203
|
+
writer schema as the reader schema (no reader-schema evolution yet); Protobuf
|
|
2204
|
+
supports the top-level message type only; `readSnapshot` remains JSON-only.
|
|
2205
|
+
|
|
2206
|
+
## Schema Registry client
|
|
2207
|
+
|
|
2208
|
+
`SchemaRegistryClient` is a minimal, dependency-free client for the Confluent Schema Registry REST API (works with Confluent Platform/Cloud, Redpanda, Karapace, and the AWS Glue SR proxy). Its scope is **subject/version management, compatibility checks, and id→schema lookups** — used both to keep your locally-defined schemas in lockstep with a central registry and as the backing lookup for the Avro/Protobuf serdes (see [Serialization: JSON, Avro, Protobuf](#serialization-json-avro-protobuf)).
|
|
2209
|
+
|
|
2210
|
+
```typescript
|
|
2211
|
+
import { SchemaRegistryClient } from '@drarzter/kafka-client/core';
|
|
2212
|
+
|
|
2213
|
+
const registry = new SchemaRegistryClient({
|
|
2214
|
+
baseUrl: 'http://localhost:8081',
|
|
2215
|
+
auth: { username: apiKey, password: apiSecret }, // optional HTTP Basic (Confluent Cloud)
|
|
2216
|
+
cacheTtlMs: 300_000, // latest-version cache TTL — default 5 min
|
|
2217
|
+
});
|
|
2218
|
+
|
|
2219
|
+
// Register (idempotent — re-registering the same schema returns the existing id)
|
|
2220
|
+
const { id } = await registry.registerSchema('order.created-value', JSON.stringify(orderJsonSchema), 'JSON');
|
|
2221
|
+
|
|
2222
|
+
// Fetch (getLatestSchema is cached; getSchemaVersion is not)
|
|
2223
|
+
const latest = await registry.getLatestSchema('order.created-value');
|
|
2224
|
+
const v2 = await registry.getSchemaVersion('order.created-value', 2);
|
|
2225
|
+
|
|
2226
|
+
// Check compatibility against the subject's policy without registering
|
|
2227
|
+
const ok = await registry.checkCompatibility('order.created-value', JSON.stringify(candidate));
|
|
2228
|
+
```
|
|
2229
|
+
|
|
2230
|
+
| Method | Cached | Description |
|
|
2231
|
+
| ------ | ------ | ----------- |
|
|
2232
|
+
| `getLatestSchema(subject)` | yes (`cacheTtlMs`) | Latest `{ id, version, schema }` for a subject |
|
|
2233
|
+
| `getSchemaVersion(subject, version)` | no | A specific registered version |
|
|
2234
|
+
| `registerSchema(subject, schema, schemaType?)` | invalidates cache | Register (idempotent); returns `{ id }`. `schemaType` defaults to `'JSON'` |
|
|
2235
|
+
| `checkCompatibility(subject, schema, schemaType?)` | no | `true` when the registry reports the schema compatible |
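The "Cached" column can be read as a small TTL cache in front of the latest-version lookup, invalidated on registration. The sketch below is illustrative: `LatestSchemaCache` and its injected `fetchLatest` function are hypothetical, and the client's real cache may be structured differently.

```typescript
// Hypothetical TTL cache for latest-version lookups; not the client's code.
interface SchemaInfo {
  id: number;
  version: number;
  schema: string;
}

class LatestSchemaCache {
  private readonly entries = new Map<string, { value: SchemaInfo; expiresAt: number }>();
  private readonly fetchLatest: (subject: string) => Promise<SchemaInfo>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    fetchLatest: (subject: string) => Promise<SchemaInfo>,
    ttlMs = 300_000, // mirrors the documented 5-minute default
    now: () => number = () => Date.now(),
  ) {
    this.fetchLatest = fetchLatest;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  async get(subject: string): Promise<SchemaInfo> {
    const hit = this.entries.get(subject);
    if (hit && hit.expiresAt > this.now()) return hit.value;
    const value = await this.fetchLatest(subject);
    this.entries.set(subject, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  /** Call after registering a schema so the next read sees the new version. */
  invalidate(subject: string): void {
    this.entries.delete(subject);
  }
}
```

A shorter TTL picks up new versions sooner at the cost of more registry round trips; invalidating on register covers the case where this process is the one publishing the new version.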
|
|
2236
|
+
|
|
2237
|
+
`registrySchema()` bridges a registry subject to this library's `SchemaLike` seam so you can attach it to a `TopicDescriptor` like any other schema. On each `parse` it resolves the subject's latest version (cached), optionally verifies the message's `x-schema-version` is not newer than what is registered, and delegates structural validation to a local validator:
|
|
2238
|
+
|
|
2239
|
+
```typescript
|
|
2240
|
+
import { topic, registrySchema } from '@drarzter/kafka-client/core';
|
|
2241
|
+
import { z } from 'zod';
|
|
2242
|
+
|
|
2243
|
+
const OrderCreated = topic('order.created').schema(
|
|
2244
|
+
registrySchema(registry, 'order.created-value', {
|
|
2245
|
+
validator: z.object({ orderId: z.string() }), // local runtime shape check
|
|
2246
|
+
enforceVersion: true, // default — fail loudly if the message version outruns the registry
|
|
2247
|
+
}),
|
|
2248
|
+
);
|
|
2249
|
+
```
|
|
2250
|
+
|
|
2251
|
+
The division of labour: the **registry governs schema evolution** (compatibility across versions); the **local validator governs runtime shape**. When `enforceVersion` is `true` (the default) a producer publishing a version newer than the latest registered version fails loudly rather than drifting silently.
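The split between version governance and shape validation can be sketched as a thin wrapper. This is not `registrySchema()` itself: `sketchRegistrySchema` is hypothetical, and passing the message version as a second `parse` argument is an assumption made for illustration, since the way the library supplies `x-schema-version` to the schema is not shown here.

```typescript
// Hypothetical sketch of "registry checks the version, validator checks the shape".
interface LatestVersionSource {
  getLatestSchema(subject: string): Promise<{ version: number }>;
}

interface ShapeValidator<T> {
  parse(data: unknown): T;
}

function sketchRegistrySchema<T>(
  registry: LatestVersionSource,
  subject: string,
  opts: { validator: ShapeValidator<T>; enforceVersion?: boolean },
): { parse(data: unknown, messageVersion?: number): Promise<T> } {
  const enforce = opts.enforceVersion ?? true;
  return {
    async parse(data: unknown, messageVersion?: number): Promise<T> {
      if (enforce && messageVersion !== undefined) {
        const latest = await registry.getLatestSchema(subject);
        if (messageVersion > latest.version) {
          // The message claims a version the registry has never seen.
          throw new Error(
            `${subject}: message schema version ${messageVersion} is newer than ` +
              `latest registered version ${latest.version}`,
          );
        }
      }
      // Structural validation is delegated to the local validator.
      return opts.validator.parse(data);
    },
  };
}
```

Any object with a `parse` method fits the validator slot, which is why a zod schema can be passed directly in the example above.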
|
|
2252
|
+
|
|
1694
2253
|
## Admin API
|
|
1695
2254
|
|
|
1696
2255
|
Inspect consumer groups, topic metadata, and delete records via the built-in admin client — no separate connection needed.
|
|
@@ -1735,6 +2294,34 @@ await kafka.deleteRecords('orders.created', [
|
|
|
1735
2294
|
|
|
1736
2295
|
Pass `offset: '-1'` to delete all records in a partition (truncate completely).
|
|
1737
2296
|
|
|
2297
|
+
## DLQ CLI
|
|
2298
|
+
|
|
2299
|
+
The package ships a `kafka-client-dlq` binary for inspecting and re-publishing dead letter queues from the terminal — no code needed. It operates on `<topic>.dlq` topics and delegates replay to `KafkaClient.replayDlq`:
|
|
2300
|
+
|
|
2301
|
+
```bash
|
|
2302
|
+
# List every .dlq topic with its message count (optionally filtered by base-topic prefix)
|
|
2303
|
+
kafka-client-dlq ls --brokers localhost:9092 [--prefix orders]
|
|
2304
|
+
|
|
2305
|
+
# Print up to N messages from <topic>.dlq — offset, x-dlq-* headers, and value
|
|
2306
|
+
kafka-client-dlq peek --brokers localhost:9092 --topic orders.created [--limit 5]
|
|
2307
|
+
|
|
2308
|
+
# Re-publish <topic>.dlq to its original topic (or --target), full or incremental
|
|
2309
|
+
kafka-client-dlq replay --brokers localhost:9092 --topic orders.created [--target orders.manual] [--dry-run] [--from-beginning | --incremental]
|
|
2310
|
+
```
|
|
2311
|
+
|
|
2312
|
+
| Flag | Command | Description |
|
|
2313
|
+
| ---- | ------- | ----------- |
|
|
2314
|
+
| `--brokers <list>` | all | Comma-separated broker addresses (**required**) |
|
|
2315
|
+
| `--prefix <name>` | `ls` | Only show DLQ topics whose base name starts with `<name>` |
|
|
2316
|
+
| `--topic <name>` | `peek`, `replay` | Base topic name — the CLI reads `<name>.dlq` |
|
|
2317
|
+
| `--limit <n>` | `peek` | Max messages to print (default `10`) |
|
|
2318
|
+
| `--target <t>` | `replay` | Override destination topic (default: `x-dlq-original-topic` header) |
|
|
2319
|
+
| `--dry-run` | `replay` | Count what would be replayed without publishing |
|
|
2320
|
+
| `--from-beginning` | `replay` | Full replay of all DLQ messages on every call (default) |
|
|
2321
|
+
| `--incremental` | `replay` | Only messages added since the previous replay |
|
|
2322
|
+
|
|
2323
|
+
`--from-beginning` and `--incremental` are mutually exclusive. Run `kafka-client-dlq --help` (or with no arguments) for the full usage text.
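
The flag rule above (full replay by default, the two modes never combined) can be sketched as a small resolver. This is an illustrative assumption about how such a check might look, not the CLI's actual source; `ReplayMode` and `resolveReplayMode` are hypothetical names.

```typescript
// Hypothetical names for illustration; not taken from the CLI's source.
type ReplayMode = 'from-beginning' | 'incremental';

function resolveReplayMode(flags: {
  fromBeginning?: boolean;
  incremental?: boolean;
}): ReplayMode {
  // Reject the contradictory combination before touching any topic.
  if (flags.fromBeginning && flags.incremental) {
    throw new Error('--from-beginning and --incremental are mutually exclusive');
  }
  // With neither flag set, fall back to the documented default (full replay).
  return flags.incremental ? 'incremental' : 'from-beginning';
}
```

Rejecting the combination up front, rather than letting one flag silently win, keeps a mistyped command from replaying more (or less) than intended.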
|
|
2324
|
+
|
|
1738
2325
|
## Graceful shutdown
|
|
1739
2326
|
|
|
1740
2327
|
`disconnect()` now drains in-flight handlers before tearing down connections — no messages are silently cut off mid-processing.
|
|
@@ -1899,6 +2486,20 @@ If the handler hasn't resolved within the window, a `warn` is logged:
|
|
|
1899
2486
|
|
|
1900
2487
|
The handler is **not** cancelled — the warning is diagnostic only. Combine with `retry` to automatically give up after a fixed number of slow attempts.
|
|
1901
2488
|
|
|
2489
|
+
## Static group membership
|
|
2490
|
+
|
|
2491
|
+
Set `groupInstanceId` in `ConsumerOptions` to give a consumer a **static** identity (`group.instance.id`). A member that restarts within the broker's `session.timeout.ms` rejoins the group with the same partition assignment and triggers **no rebalance** — ideal for Kubernetes rolling restarts and short redeploys where a transient rebalance would otherwise stall every consumer in the group:
|
|
2492
|
+
|
|
2493
|
+
```typescript
|
|
2494
|
+
await kafka.startConsumer(['orders'], handler, {
|
|
2495
|
+
groupInstanceId: `orders-svc-${process.env.HOSTNAME}`,
|
|
2496
|
+
});
|
|
2497
|
+
```
|
|
2498
|
+
|
|
2499
|
+
The id must be **unique per member** within the consumer group — derive it from a stable per-pod value such as the StatefulSet ordinal or hostname. Two live members sharing the same `groupInstanceId` are fenced by the broker.
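
One pitfall with the template-literal form: if the environment variable is unset, JavaScript interpolates the literal string `undefined`, so every pod would end up with the same id and be fenced. A minimal guard might look like the sketch below; `staticMemberId` is a hypothetical helper, not part of the library.

```typescript
// Hypothetical helper for illustration; not exported by the library.
function staticMemberId(service: string, podName: string | undefined): string {
  // Fail at startup instead of producing "<service>-undefined" on every pod.
  if (podName === undefined || podName === '') {
    throw new Error('staticMemberId: a stable per-pod name is required');
  }
  return `${service}-${podName}`;
}
```

In a StatefulSet pod this would be called as `staticMemberId('orders-svc', process.env.HOSTNAME)` and the result passed as `groupInstanceId`.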
|
|
2500
|
+
|
|
2501
|
+
`groupInstanceId` is applied only to the consumer you set it on. It is **not** propagated to retry-chain companion consumers — those run in their own groups (`<groupId>-retry.N`) and rebalance independently. It can also be supplied via the `KAFKA_CONSUMER_GROUP_INSTANCE_ID` environment variable (see [Environment configuration](#environment-configuration)).
|
|
2502
|
+
|
|
1902
2503
|
## Schema validation
|
|
1903
2504
|
|
|
1904
2505
|
Add runtime message validation using any library with a `.parse()` method — Zod, Valibot, ArkType, or a custom validator. No extra dependency required.
|
|
@@ -2019,6 +2620,70 @@ interface SchemaParseContext {
|
|
|
2019
2620
|
|
|
2020
2621
|
Existing validators (Zod, Valibot, ArkType, custom) that only use the first argument continue to work unchanged — the second argument is silently ignored.
|
|
2021
2622
|
|
|
2623
|
+
### Versioned schemas
|
|
2624
|
+
|
|
2625
|
+
`versionedSchema()` composes per-version validators into a single `SchemaLike` that dispatches on the message's `x-schema-version` header (via `SchemaParseContext.version`). Pass a map of version number → validator, plus an optional `migrate` hook that upgrades older shapes to the latest:
|
|
2626
|
+
|
|
2627
|
+
```typescript
|
|
2628
|
+
import { topic, versionedSchema } from '@drarzter/kafka-client';
|
|
2629
|
+
import { z } from 'zod';
|
|
2630
|
+
|
|
2631
|
+
const OrderSchema = versionedSchema<{ orderId: string; amountMinor: number }>(
|
|
2632
|
+
{
|
|
2633
|
+
1: z.object({ orderId: z.string(), amount: z.number() }), // legacy: major units
|
|
2634
|
+
2: z.object({ orderId: z.string(), amountMinor: z.number().int() }), // current: minor units
|
|
2635
|
+
},
|
|
2636
|
+
{
|
|
2637
|
+
// migrate(data, fromVersion, latestVersion) → data in its latest shape
|
|
2638
|
+
migrate: (data, from) =>
|
|
2639
|
+
from === 1
|
|
2640
|
+
? { orderId: data.orderId, amountMinor: Math.round(data.amount * 100) }
|
|
2641
|
+
: data,
|
|
2642
|
+
},
|
|
2643
|
+
);
|
|
2644
|
+
|
|
2645
|
+
const OrderCreated = topic('order.created').schema(OrderSchema);
|
|
2646
|
+
```
|
|
2647
|
+
|
|
2648
|
+
Dispatch rules:
|
|
2649
|
+
|
|
2650
|
+
- **Consume path** — the version comes from the `x-schema-version` header (defaults to `1` when absent).
|
|
2651
|
+
- **Send path** — the version comes from `SendOptions.schemaVersion` (defaults to `1`).
|
|
2652
|
+
- **No parse context** (a direct `schema.parse(data)` call) — the **latest** registered version is assumed.
|
|
2653
|
+
|
|
2654
|
+
After a non-latest version is parsed, `migrate` (if provided) is called so your handler always receives the latest shape. Without a `migrate` hook, older versions are returned as parsed and callers must handle shape differences themselves.
|
|
2655
|
+
|
|
2656
|
+
A message carrying a version with **no registered schema throws** instead of being validated against the wrong shape. The error lists every registered version, so a misconfigured producer fails loudly:
|
|
2657
|
+
|
|
2658
|
+
```text
|
|
2659
|
+
versionedSchema: no schema registered for version 3 (topic "order.created") — registered versions: 1, 2
|
|
2660
|
+
```
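
To make the dispatch rules concrete, here is a dependency-free sketch of the same idea. It is an illustration under stated assumptions, not the library's implementation: `Validator`, `ParseContext`, and `dispatchByVersion` are hypothetical stand-ins, the validators are trivial pass-throughs, and the error text is simplified.

```typescript
// Hypothetical stand-ins for illustration; not the library's exported types.
interface Validator<T> {
  parse(data: unknown): T;
}
interface ParseContext {
  version?: number;
}

function dispatchByVersion<TLatest>(
  validators: Record<number, Validator<unknown>>,
  migrate?: (data: any, from: number, latest: number) => TLatest,
): { parse(data: unknown, ctx?: ParseContext): TLatest } {
  const versions = Object.keys(validators).map(Number).sort((a, b) => a - b);
  if (versions.length === 0) throw new Error('at least one version is required');
  const latest = Math.max(...versions);

  return {
    parse(data: unknown, ctx?: ParseContext): TLatest {
      // No context at all: assume latest. Context without a version: default to 1.
      const version = ctx === undefined ? latest : (ctx.version ?? 1);
      const validator = validators[version];
      if (validator === undefined) {
        throw new Error(
          `no schema registered for version ${version}; registered versions: ${versions.join(', ')}`,
        );
      }
      const parsed = validator.parse(data);
      // Only non-latest payloads are migrated; without a hook they pass through as parsed.
      return version !== latest && migrate !== undefined
        ? migrate(parsed, version, latest)
        : (parsed as TLatest);
    },
  };
}

type OrderV2 = { orderId: string; amountMinor: number };

// Pass-through validators keep the sketch self-contained; real code would validate.
const orderSchema = dispatchByVersion<OrderV2>(
  {
    1: { parse: (d) => d as { orderId: string; amount: number } },
    2: { parse: (d) => d as OrderV2 },
  },
  (data, from) =>
    from === 1
      ? { orderId: data.orderId, amountMinor: Math.round(data.amount * 100) }
      : data,
);
```

Calling `orderSchema.parse(payload, { version: 1 })` runs the version 1 validator and then the migration, while a bare `orderSchema.parse(payload)` goes straight to version 2.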
|
|
2661
|
+
|
|
2662
|
+
## Constructor options validation
|
|
2663
|
+
|
|
2664
|
+
The `KafkaClient` constructor validates its arguments up front. If anything is invalid it throws a **single aggregated error** listing every problem at once, so a misconfigured client fails at construction with a clear message instead of surfacing a confusing driver error on first use:
|
|
2665
|
+
|
|
2666
|
+
```typescript
|
|
2667
|
+
new KafkaClient('', '', [], { numPartitions: 0 });
|
|
2668
|
+
// throws:
|
|
2669
|
+
// KafkaClient: invalid configuration:
|
|
2670
|
+
// - clientId must be a non-empty string
|
|
2671
|
+
// - groupId must be a non-empty string
|
|
2672
|
+
// - brokers must be a non-empty array of broker addresses
|
|
2673
|
+
// - numPartitions must be a positive integer (got 0)
|
|
2674
|
+
```
|
|
2675
|
+
|
|
2676
|
+
Checks performed:
|
|
2677
|
+
|
|
2678
|
+
- `clientId` and `groupId` must be non-empty strings.
|
|
2679
|
+
- `brokers` must be a non-empty array with no empty entries — **unless** a custom `transport` is supplied (e.g. `FakeTransport` in tests), in which case an empty `brokers` array is allowed since no broker is dialled.
|
|
2680
|
+
- `numPartitions`, when set, must be a positive integer.
|
|
2681
|
+
- `transactionalId`, when set, must be non-empty.
|
|
2682
|
+
- `clockRecovery.topics` must be an array; `clockRecovery.timeoutMs`, when set, must be `> 0`.
|
|
2683
|
+
- `lagThrottle.maxLag` must be `>= 0`; `lagThrottle.pollIntervalMs` must be `> 0`; `lagThrottle.maxWaitMs` must be `>= 0` (each validated only when set).
|
|
2684
|
+
|
|
2685
|
+
This applies to both `new KafkaClient(...)` and `KafkaModule.register()` / `registerAsync()`, which construct the client under the hood.
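
The aggregated-error pattern described above can be sketched as "collect every problem, then throw once". The snippet is a simplified illustration, not the library's validator: `SketchConfig`, `collectProblems`, and `assertValid` are hypothetical names, and only four of the listed checks are covered.

```typescript
// Hypothetical config shape for illustration; the real constructor takes positional arguments.
interface SketchConfig {
  clientId: string;
  groupId: string;
  brokers: string[];
  numPartitions?: number;
}

function collectProblems(cfg: SketchConfig): string[] {
  const problems: string[] = [];
  if (cfg.clientId === '') problems.push('clientId must be a non-empty string');
  if (cfg.groupId === '') problems.push('groupId must be a non-empty string');
  if (cfg.brokers.length === 0 || cfg.brokers.some((b) => b === '')) {
    problems.push('brokers must be a non-empty array of broker addresses');
  }
  const n = cfg.numPartitions;
  if (n !== undefined && (!Number.isInteger(n) || n <= 0)) {
    problems.push(`numPartitions must be a positive integer (got ${n})`);
  }
  return problems;
}

function assertValid(cfg: SketchConfig): void {
  const problems = collectProblems(cfg);
  // Throw once, with every problem listed, rather than on the first failure.
  if (problems.length > 0) {
    throw new Error(
      `invalid configuration:\n${problems.map((p) => `  - ${p}`).join('\n')}`,
    );
  }
}
```

Keeping collection separate from throwing lets the same checks feed a single error message, which is what makes it possible to fix all misconfigurations in one pass.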
|
|
2686
|
+
|
|
2022
2687
|
## Health check
|
|
2023
2688
|
|
|
2024
2689
|
Monitor Kafka connectivity with the built-in health indicator:
|
|
@@ -2129,6 +2794,26 @@ The integration suite spins up a single-node KRaft Kafka container and tests sen
|
|
|
2129
2794
|
|
|
2130
2795
|
Both suites run in CI on every push to `main` and on pull requests.
|
|
2131
2796
|
|
|
2797
|
+
**Chaos suite** — fault-injection tests (broker restarts, forced rebalances) that verify redelivery and offset-commit guarantees under failure:
|
|
2798
|
+
|
|
2799
|
+
```bash
|
|
2800
|
+
npm run test:chaos
|
|
2801
|
+
```
|
|
2802
|
+
|
|
2803
|
+
**Benchmark** — measure the wrapper's overhead over the raw driver:
|
|
2804
|
+
|
|
2805
|
+
```bash
|
|
2806
|
+
npm run bench
|
|
2807
|
+
```
|
|
2808
|
+
|
|
2809
|
+
The throughput benchmark reports roughly **2% overhead** versus using `@confluentinc/kafka-javascript` directly: the typed envelope, Lamport clock, and instrumentation hooks cost very little on the hot path.
|
|
2810
|
+
|
|
2811
|
+
**Clean up stray containers** — if a Testcontainers run is interrupted, remove leftover containers:
|
|
2812
|
+
|
|
2813
|
+
```bash
|
|
2814
|
+
npm run containers:clean
|
|
2815
|
+
```
|
|
2816
|
+
|
|
2132
2817
|
## File naming conventions
|
|
2133
2818
|
|
|
2134
2819
|
Hyphens within a multi-word name; dot separates the name from its role suffix.
|