jetstream_bridge 5.0.0 → 5.0.1
- checksums.yaml +4 -4
- data/README.md +5 -4
- data/docs/ARCHITECTURE.md +1135 -0
- data/lib/jetstream_bridge/consumer/consumer.rb +1 -1
- data/lib/jetstream_bridge/version.rb +1 -1
- metadata +2 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 6b75d900f4ede2b0dbc787641f532cfcbe07619435eaf3a1673b163d8f87658b
|
|
4
|
+
data.tar.gz: 7295f58824d037c799c6aa3487b650fe3030b50acd24fe858878dfbafa8c3cba
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 5788fd8947c2d9d3f6e7dc9c7da0bd9fe072c9682fb13a220b12977e4c53e3ec9b373917f96a7ee415488b3081c3dd939afedd56a9ec98f324e0e31919a48106
|
|
7
|
+
data.tar.gz: a8223affe27e1cf99bf2db07bf8672b49ecece81e410b40f3727366813ba6d39e18ae828052b6c9cf68693354af2eb382a3566338fb4312ceb78008c2a993b4b
|
data/README.md
CHANGED
|
@@ -62,10 +62,11 @@ consumer.run!
|
|
|
62
62
|
|
|
63
63
|
## Documentation
|
|
64
64
|
|
|
65
|
-
- [Getting Started](docs/GETTING_STARTED.md)
|
|
66
|
-
- [
|
|
67
|
-
- [
|
|
68
|
-
- [
|
|
65
|
+
- [Getting Started](docs/GETTING_STARTED.md) - Setup, configuration, and basic usage
|
|
66
|
+
- [Architecture & Topology](docs/ARCHITECTURE.md) - Internal architecture, message flow, and patterns
|
|
67
|
+
- [Production Guide](docs/PRODUCTION.md) - Production deployment and monitoring
|
|
68
|
+
- [Restricted Permissions & Provisioning](docs/RESTRICTED_PERMISSIONS.md) - Manual provisioning and security
|
|
69
|
+
- [Testing with Mock NATS](docs/TESTING.md) - Fast, no-infra testing
|
|
69
70
|
|
|
70
71
|
## License
|
|
71
72
|
|
data/docs/ARCHITECTURE.md
ADDED
|
@@ -0,0 +1,1135 @@
|
|
|
1
|
+
# Architecture & Topology
|
|
2
|
+
|
|
3
|
+
This document explains the internal architecture, topology patterns, and message flow of JetStream Bridge.
|
|
4
|
+
|
|
5
|
+
## Table of Contents
|
|
6
|
+
|
|
7
|
+
- [Overview](#overview)
|
|
8
|
+
- [Core Components](#core-components)
|
|
9
|
+
- [Topology Model](#topology-model)
|
|
10
|
+
- [Subject Naming & Routing](#subject-naming--routing)
|
|
11
|
+
- [Message Flow](#message-flow)
|
|
12
|
+
- [Reliability Patterns](#reliability-patterns)
|
|
13
|
+
- [Consumer Modes](#consumer-modes)
|
|
14
|
+
- [Configuration & Lifecycle](#configuration--lifecycle)
|
|
15
|
+
- [Error Handling](#error-handling)
|
|
16
|
+
- [Thread Safety](#thread-safety)
|
|
17
|
+
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
## Overview
|
|
21
|
+
|
|
22
|
+
JetStream Bridge provides reliable, production-ready message passing between Ruby/Rails services using NATS JetStream. The architecture is designed around:
|
|
23
|
+
|
|
24
|
+
1. **Single stream per application pair** - One JetStream stream handles bidirectional communication
|
|
25
|
+
2. **Durable consumers** - Each application has a durable consumer (`{app_name}-workers`)
|
|
26
|
+
3. **Subject-based routing** - Messages routed via subjects: `{source}.sync.{destination}`
|
|
27
|
+
4. **Optional reliability patterns** - Outbox (publisher), Inbox (consumer), and DLQ (both)
|
|
28
|
+
5. **Flexible deployment** - Pull or push consumers, with auto or manual provisioning
|
|
29
|
+
|
|
30
|
+
### Architecture Diagram
|
|
31
|
+
|
|
32
|
+
```text
|
|
33
|
+
┌─────────────────┐ ┌─────────────────┐
|
|
34
|
+
│ Application │ │ Application │
|
|
35
|
+
│ "api" │ │ "worker" │
|
|
36
|
+
├─────────────────┤ ├─────────────────┤
|
|
37
|
+
│ │ │ │
|
|
38
|
+
│ Publisher │ │ Publisher │
|
|
39
|
+
│ (Optional │ │ (Optional │
|
|
40
|
+
│ Outbox) │ │ Outbox) │
|
|
41
|
+
│ │ │ │
|
|
42
|
+
└────────┬────────┘ └────────┬────────┘
|
|
43
|
+
│ │
|
|
44
|
+
│ Publish to: │ Publish to:
|
|
45
|
+
│ api.sync.worker │ worker.sync.api
|
|
46
|
+
│ │
|
|
47
|
+
└──────────────┐ ┌────────────┘
|
|
48
|
+
│ │
|
|
49
|
+
▼ ▼
|
|
50
|
+
┌──────────────────────┐
|
|
51
|
+
│ NATS JetStream │
|
|
52
|
+
│ │
|
|
53
|
+
│ Stream: "my-stream" │
|
|
54
|
+
│ Subjects: │
|
|
55
|
+
│ - api.sync.worker │
|
|
56
|
+
│ - worker.sync.api │
|
|
57
|
+
│ - api.sync.dlq │
|
|
58
|
+
│ - worker.sync.dlq │
|
|
59
|
+
│ │
|
|
60
|
+
│ Consumers: │
|
|
61
|
+
│ - api-workers │
|
|
62
|
+
│ - worker-workers │
|
|
63
|
+
└──────────────────────┘
|
|
64
|
+
│ │
|
|
65
|
+
┌──────────────┘ └────────────┐
|
|
66
|
+
│ │
|
|
67
|
+
│ Subscribe to: │ Subscribe to:
|
|
68
|
+
│ worker.sync.api │ api.sync.worker
|
|
69
|
+
│ │
|
|
70
|
+
┌────────▼────────┐ ┌────────▼────────┐
|
|
71
|
+
│ Consumer │ │ Consumer │
|
|
72
|
+
│ (Optional │ │ (Optional │
|
|
73
|
+
│ Inbox) │ │ Inbox) │
|
|
74
|
+
│ │ │ │
|
|
75
|
+
│ DLQ Handler │ │ DLQ Handler │
|
|
76
|
+
└─────────────────┘ └─────────────────┘
|
|
77
|
+
```
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## Core Components
|
|
82
|
+
|
|
83
|
+
### Connection (`lib/jetstream_bridge/core/connection.rb`)
|
|
84
|
+
|
|
85
|
+
Thread-safe singleton managing NATS connections:
|
|
86
|
+
|
|
87
|
+
- Validates NATS URLs and JetStream availability
|
|
88
|
+
- Automatic reconnection with configurable retry logic
|
|
89
|
+
- Health check API with caching (30s TTL)
|
|
90
|
+
- Reconnect handlers for post-fork scenarios (Puma, Sidekiq)
|
|
91
|
+
- State management: disconnected → connecting → connected → reconnecting
|
|
92
|
+
|
|
93
|
+
**Key Methods:**
|
|
94
|
+
|
|
95
|
+
- `Connection.instance` - Get singleton connection
|
|
96
|
+
- `connection.connect!` - Establish NATS connection
|
|
97
|
+
- `connection.nats` - Access raw NATS client
|
|
98
|
+
- `connection.jetstream` - Access JetStream context
|
|
99
|
+
- `connection.healthy?` - Check connection health
|
|
100
|
+
- `connection.reconnect!` - Force reconnection
|
|
101
|
+
|
|
102
|
+
### Publisher (`lib/jetstream_bridge/publisher/publisher.rb`)
|
|
103
|
+
|
|
104
|
+
Publishes events to JetStream:
|
|
105
|
+
|
|
106
|
+
- Event envelope construction (event_id, timestamps, schema_version, etc.)
|
|
107
|
+
- Resource ID extraction from payload
|
|
108
|
+
- Optional outbox pattern for transactional guarantees
|
|
109
|
+
- Retry logic with exponential backoff
|
|
110
|
+
- Duplicate detection via NATS message ID header
|
|
111
|
+
|
|
112
|
+
**Usage:**
|
|
113
|
+
|
|
114
|
+
```ruby
|
|
115
|
+
JetstreamBridge.publish(
|
|
116
|
+
resource_type: "user",
|
|
117
|
+
event_type: "user.created",
|
|
118
|
+
payload: { id: 1, email: "user@example.com" }
|
|
119
|
+
)
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
**Envelope Structure:**
|
|
123
|
+
|
|
124
|
+
```json
|
|
125
|
+
{
|
|
126
|
+
"event_id": "uuid",
|
|
127
|
+
"event_type": "user.created",
|
|
128
|
+
"resource_type": "user",
|
|
129
|
+
"resource_id": "1",
|
|
130
|
+
"payload": { "id": 1, "email": "user@example.com" },
|
|
131
|
+
"produced_at": "2024-01-01T00:00:00Z",
|
|
132
|
+
"producer": "api",
|
|
133
|
+
"schema_version": "1.0",
|
|
134
|
+
"trace_id": "trace-uuid"
|
|
135
|
+
}
|
|
136
|
+
```
|
|
137
|
+
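The envelope above can be approximated with a small builder. This is a minimal sketch, not the gem's Publisher: `build_envelope` and its keyword arguments are hypothetical names, and it assumes `resource_id` comes from the payload's `id` key and is serialized as a string, as the example envelope suggests.

```ruby
require "securerandom"
require "time"

# Hypothetical helper for illustration only; the gem builds envelopes internally.
def build_envelope(resource_type:, event_type:, payload:, producer:,
                   event_id: SecureRandom.uuid, trace_id: SecureRandom.uuid,
                   now: Time.now.utc)
  {
    "event_id" => event_id,
    "event_type" => event_type,
    "resource_type" => resource_type,
    # Assumption: resource_id is the payload's id, stored as a string.
    "resource_id" => (payload[:id] || payload["id"]).to_s,
    "payload" => payload,
    "produced_at" => now.iso8601,
    "producer" => producer,
    "schema_version" => "1.0",
    "trace_id" => trace_id
  }
end
```

Passing `now:` and `event_id:` explicitly keeps the helper deterministic in tests.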
|
|
138
|
+
### Consumer (`lib/jetstream_bridge/consumer/consumer.rb`)
|
|
139
|
+
|
|
140
|
+
Subscribes to and processes messages:
|
|
141
|
+
|
|
142
|
+
- Durable consumer creation and subscription binding
|
|
143
|
+
- Batch fetching (pull mode) or delivery subject subscription (push mode)
|
|
144
|
+
- Message parsing and handler invocation
|
|
145
|
+
- Optional inbox pattern for exactly-once processing
|
|
146
|
+
- Automatic DLQ routing for unrecoverable errors
|
|
147
|
+
- Graceful shutdown with signal handlers
|
|
148
|
+
|
|
149
|
+
**Usage:**
|
|
150
|
+
|
|
151
|
+
```ruby
|
|
152
|
+
consumer = JetstreamBridge::Consumer.new do |event|
|
|
153
|
+
User.upsert({
|
|
154
|
+
id: event.resource_id,
|
|
155
|
+
email: event.payload["email"]
|
|
156
|
+
})
|
|
157
|
+
end
|
|
158
|
+
|
|
159
|
+
consumer.run! # Blocks and processes messages
|
|
160
|
+
```
|
|
161
|
+
|
|
162
|
+
### Provisioner (`lib/jetstream_bridge/provisioner.rb`)
|
|
163
|
+
|
|
164
|
+
Creates and updates JetStream streams and consumers:
|
|
165
|
+
|
|
166
|
+
- Stream creation with work-queue retention
|
|
167
|
+
- Subject management and overlap detection
|
|
168
|
+
- Consumer creation with delivery policies
|
|
169
|
+
- Idempotent operations (safe to re-run)
|
|
170
|
+
- Can run at deploy-time with admin credentials or at runtime
|
|
171
|
+
|
|
172
|
+
**Stream Configuration:**
|
|
173
|
+
|
|
174
|
+
```ruby
|
|
175
|
+
{
|
|
176
|
+
name: "jetstream-bridge-stream",
|
|
177
|
+
retention: :workqueue,
|
|
178
|
+
storage: :file,
|
|
179
|
+
subjects: [
|
|
180
|
+
"api.sync.worker",
|
|
181
|
+
"worker.sync.api",
|
|
182
|
+
"api.sync.dlq",
|
|
183
|
+
"worker.sync.dlq"
|
|
184
|
+
]
|
|
185
|
+
}
|
|
186
|
+
```
|
|
187
|
+
|
|
188
|
+
**Consumer Configuration:**
|
|
189
|
+
|
|
190
|
+
```ruby
|
|
191
|
+
{
|
|
192
|
+
durable_name: "api-workers",
|
|
193
|
+
filter_subject: "worker.sync.api",
|
|
194
|
+
ack_policy: :explicit,
|
|
195
|
+
deliver_policy: :all,
|
|
196
|
+
max_deliver: 5,
|
|
197
|
+
ack_wait: 30_000_000_000, # 30s in nanoseconds
|
|
198
|
+
backoff: [1_000_000_000, 5_000_000_000, ...]
|
|
199
|
+
}
|
|
200
|
+
```
|
|
201
|
+
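The consumer configuration expresses `ack_wait` and `backoff` in nanoseconds, while the gem's settings elsewhere in this document use strings such as `'30s'`. The sketch below shows one way that conversion could look. `duration_to_nanos` and `UNIT_TO_NANOS` are hypothetical names, and the supported units (`ms`, `s`, `m`) are an assumption.

```ruby
# Hypothetical converter for illustration; not part of the gem's API.
UNIT_TO_NANOS = {
  "ms" => 1_000_000,
  "s" => 1_000_000_000,
  "m" => 60_000_000_000
}.freeze

# Converts strings such as "30s" into nanoseconds.
def duration_to_nanos(value)
  match = /\A(\d+)(ms|s|m)\z/.match(value.to_s)
  raise ArgumentError, "unsupported duration: #{value.inspect}" unless match

  match[1].to_i * UNIT_TO_NANOS.fetch(match[2])
end
```

For example, `duration_to_nanos("30s")` yields the `30_000_000_000` shown for `ack_wait` above.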
|
|
202
|
+
### Topology (`lib/jetstream_bridge/topology/topology.rb`)
|
|
203
|
+
|
|
204
|
+
Orchestrates stream and consumer provisioning:
|
|
205
|
+
|
|
206
|
+
- Overlap guard prevents subject conflicts between streams
|
|
207
|
+
- Stream support retries when conflicts occur
|
|
208
|
+
- Validates and normalizes subjects
|
|
209
|
+
- Coordinates provisioning across components
|
|
210
|
+
|
|
211
|
+
---
|
|
212
|
+
|
|
213
|
+
## Topology Model
|
|
214
|
+
|
|
215
|
+
### Stream Structure
|
|
216
|
+
|
|
217
|
+
**One stream per application pair** (or shared stream for multiple apps):
|
|
218
|
+
|
|
219
|
+
```text
|
|
220
|
+
Stream: "jetstream-bridge-stream"
|
|
221
|
+
├── Subjects:
|
|
222
|
+
│ ├── api.sync.worker (api publishes, worker consumes)
|
|
223
|
+
│ ├── worker.sync.api (worker publishes, api consumes)
|
|
224
|
+
│ ├── api.sync.dlq (api's dead letter queue)
|
|
225
|
+
│ └── worker.sync.dlq (worker's dead letter queue)
|
|
226
|
+
│
|
|
227
|
+
├── Retention: workqueue (messages deleted after ack)
|
|
228
|
+
├── Storage: file (persistent on disk)
|
|
229
|
+
└── Consumers:
|
|
230
|
+
├── api-workers (filters: worker.sync.api)
|
|
231
|
+
└── worker-workers (filters: api.sync.worker)
|
|
232
|
+
```
|
|
233
|
+
|
|
234
|
+
### Subject Pattern
|
|
235
|
+
|
|
236
|
+
**IMPORTANT:** `app_name` should not include environment identifiers (e.g., use "api" not "api-production"). Consumer names are shared across environments.
|
|
237
|
+
|
|
238
|
+
#### Source Subject (Publisher)
|
|
239
|
+
|
|
240
|
+
```text
|
|
241
|
+
{app_name}.sync.{destination_app}
|
|
242
|
+
```
|
|
243
|
+
|
|
244
|
+
Example: `api.sync.worker`
|
|
245
|
+
|
|
246
|
+
#### Destination Subject (Consumer)
|
|
247
|
+
|
|
248
|
+
```text
|
|
249
|
+
{destination_app}.sync.{app_name}
|
|
250
|
+
```
|
|
251
|
+
|
|
252
|
+
Example: `worker.sync.api` (reverse of source)
|
|
253
|
+
|
|
254
|
+
#### DLQ Subject
|
|
255
|
+
|
|
256
|
+
```text
|
|
257
|
+
{app_name}.sync.dlq
|
|
258
|
+
```
|
|
259
|
+
|
|
260
|
+
Example: `api.sync.dlq`
|
|
261
|
+
|
|
262
|
+
### Consumer Naming
|
|
263
|
+
|
|
264
|
+
**Durable consumer name:**
|
|
265
|
+
|
|
266
|
+
```text
|
|
267
|
+
{app_name}-workers
|
|
268
|
+
```
|
|
269
|
+
|
|
270
|
+
Example: `api-workers`
|
|
271
|
+
|
|
272
|
+
**Filter subject:**
|
|
273
|
+
|
|
274
|
+
```text
|
|
275
|
+
{destination_app}.sync.{app_name}
|
|
276
|
+
```
|
|
277
|
+
|
|
278
|
+
Example: `worker.sync.api`
|
|
279
|
+
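The naming rules in this section can be summarized in a few lines of Ruby. This is a sketch under the patterns stated above; `SubjectNames` is a hypothetical struct, not a class exposed by the gem.

```ruby
# Hypothetical struct for illustration; the gem derives these names internally.
SubjectNames = Struct.new(:app_name, :destination_app) do
  # Subject this app publishes to.
  def source_subject
    "#{app_name}.sync.#{destination_app}"
  end

  # Subject this app consumes from (also the consumer's filter subject).
  def destination_subject
    "#{destination_app}.sync.#{app_name}"
  end

  # Dead letter queue subject for this app.
  def dlq_subject
    "#{app_name}.sync.dlq"
  end

  # Durable consumer name for this app.
  def durable_name
    "#{app_name}-workers"
  end
end
```

With `SubjectNames.new("api", "worker")`, the four methods return the same values used in the examples above.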
|
|
280
|
+
---
|
|
281
|
+
|
|
282
|
+
## Subject Naming & Routing
|
|
283
|
+
|
|
284
|
+
### Subject Validation
|
|
285
|
+
|
|
286
|
+
Subjects must:
|
|
287
|
+
|
|
288
|
+
- Not contain NATS wildcards (`*`, `>`)
|
|
289
|
+
- Not contain spaces or control characters
|
|
290
|
+
- Not exceed 255 characters
|
|
291
|
+
- Use valid characters: alphanumeric, hyphen, underscore, period
|
|
292
|
+
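A validator for the rules above might look like the following. It is a minimal sketch: `valid_subject?` and both constants are hypothetical names, and the allow-list regex is one possible reading of "valid characters". Because wildcards, spaces, and control characters are outside the allowed set, a single pattern covers all four rules except length.

```ruby
# Hypothetical validator for illustration; not the gem's implementation.
MAX_SUBJECT_LENGTH = 255
VALID_SUBJECT = /\A[A-Za-z0-9_.-]+\z/

def valid_subject?(subject)
  return false unless subject.is_a?(String)
  return false if subject.empty? || subject.length > MAX_SUBJECT_LENGTH

  # The allow-list excludes "*", ">", whitespace, and control characters.
  VALID_SUBJECT.match?(subject)
end
```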
|
|
293
|
+
### Subject Matching
|
|
294
|
+
|
|
295
|
+
JetStream Bridge implements NATS wildcard matching:
|
|
296
|
+
|
|
297
|
+
- `*` - Matches exactly one token (e.g., `api.*.worker` matches `api.sync.worker`)
|
|
298
|
+
- `>` - Matches one or more tokens (e.g., `api.>` matches `api.sync.worker`)
|
|
299
|
+
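The two wildcard rules can be expressed as a token-by-token comparison. The sketch below is illustrative only: `subject_matches?` is a hypothetical name, and it assumes `>` appears only as the last token of a pattern.

```ruby
# Hypothetical matcher for illustration; not the gem's implementation.
def subject_matches?(pattern, subject)
  pattern_tokens = pattern.split(".")
  subject_tokens = subject.split(".")

  pattern_tokens.each_with_index do |token, index|
    # ">" must consume at least one remaining subject token.
    return subject_tokens.length > index if token == ">"
    return false if index >= subject_tokens.length
    next if token == "*" # "*" accepts exactly one token

    return false unless token == subject_tokens[index]
  end

  # Without ">", both sides must have the same number of tokens.
  pattern_tokens.length == subject_tokens.length
end
```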
|
|
300
|
+
### Overlap Detection
|
|
301
|
+
|
|
302
|
+
The `OverlapGuard` prevents subject conflicts:
|
|
303
|
+
|
|
304
|
+
```ruby
|
|
305
|
+
# Existing stream has: "orders.>"
|
|
306
|
+
# New stream wants: "orders.created"
|
|
307
|
+
# Result: OVERLAP - "orders.created" would be captured by "orders.>"
|
|
308
|
+
```
|
|
309
|
+
|
|
310
|
+
Overlap detection ensures messages route to exactly one stream.
|
|
311
|
+
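An overlap check differs from matching in that both sides may contain wildcards. The following sketch shows one way to decide whether two subject patterns can capture the same message. `subjects_overlap?` is a hypothetical name and this is not the `OverlapGuard` source.

```ruby
# Hypothetical overlap check for illustration; not the gem's OverlapGuard.
def subjects_overlap?(left, right)
  left_tokens = left.split(".")
  right_tokens = right.split(".")
  index = 0

  loop do
    a = left_tokens[index]
    b = right_tokens[index]

    return true if a.nil? && b.nil?   # both ended together: same shape
    return false if a.nil? || b.nil?  # one ended early with no ">" to absorb the rest
    return true if a == ">" || b == ">" # ">" absorbs everything that remains
    return false unless a == "*" || b == "*" || a == b

    index += 1
  end
end
```

For the example above, `subjects_overlap?("orders.>", "orders.created")` is true, so the new stream would be rejected.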
|
|
312
|
+
---
|
|
313
|
+
|
|
314
|
+
## Message Flow
|
|
315
|
+
|
|
316
|
+
### Publishing Flow
|
|
317
|
+
|
|
318
|
+
```text
|
|
319
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
320
|
+
│ 1. Application calls JetstreamBridge.publish(...) │
|
|
321
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
322
|
+
│
|
|
323
|
+
▼
|
|
324
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
325
|
+
│ 2. Publisher builds envelope │
|
|
326
|
+
│ - Generate event_id (UUID) │
|
|
327
|
+
│ - Extract resource_id from payload │
|
|
328
|
+
│ - Add timestamps, producer, schema_version │
|
|
329
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
330
|
+
│
|
|
331
|
+
▼
|
|
332
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
333
|
+
│ 3. [OPTIONAL] Outbox pattern │
|
|
334
|
+
│ - OutboxRepository.persist_pre() │
|
|
335
|
+
│ - State: "publishing" │
|
|
336
|
+
│ - Database transaction commits │
|
|
337
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
338
|
+
│
|
|
339
|
+
▼
|
|
340
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
341
|
+
│ 4. Publish to NATS JetStream │
|
|
342
|
+
│ - Subject: {app_name}.sync.{destination_app} │
|
|
343
|
+
│ - Header: nats-msg-id = event_id (deduplication) │
|
|
344
|
+
│ - Retry with exponential backoff on transient errors │
|
|
345
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
346
|
+
│
|
|
347
|
+
▼
|
|
348
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
349
|
+
│ 5. [OPTIONAL] Outbox update │
|
|
350
|
+
│ - Success: OutboxRepository.persist_success() │
|
|
351
|
+
│ - Failure: OutboxRepository.persist_failure() │
|
|
352
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
353
|
+
│
|
|
354
|
+
▼
|
|
355
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
356
|
+
│ 6. Return PublishResult │
|
|
357
|
+
│ - success: true/false │
|
|
358
|
+
│ - event_id: UUID │
|
|
359
|
+
│ - duplicate: true/false (if seen before) │
|
|
360
|
+
└──────────────────────────────────────────────────────────────┘
|
|
361
|
+
```
|
|
362
|
+
|
|
363
|
+
### Consuming Flow
|
|
364
|
+
|
|
365
|
+
```text
|
|
366
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
367
|
+
│ 1. Application creates Consumer.new { |event| ... } │
|
|
368
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
369
|
+
│
|
|
370
|
+
▼
|
|
371
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
372
|
+
│ 2. SubscriptionManager ensures durable consumer │
|
|
373
|
+
│ - Consumer: {app_name}-workers │
|
|
374
|
+
│ - Filter: {destination_app}.sync.{app_name} │
|
|
375
|
+
│ - Create if not exists (idempotent) │
|
|
376
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
377
|
+
│
|
|
378
|
+
▼
|
|
379
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
380
|
+
│ 3. Subscribe to consumer │
|
|
381
|
+
│ - Pull mode: $JS.API.CONSUMER.MSG.NEXT.{stream}.{durable}│
|
|
382
|
+
│ - Push mode: {delivery_subject} │
|
|
383
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
384
|
+
│
|
|
385
|
+
▼
|
|
386
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
387
|
+
│ 4. Consumer.run! starts main loop │
|
|
388
|
+
│ - Fetch batch of messages │
|
|
389
|
+
│ - Process each message sequentially │
|
|
390
|
+
│ - Idle backoff when no messages (0.05s → 1.0s) │
|
|
391
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
392
|
+
│
|
|
393
|
+
▼
|
|
394
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
395
|
+
│ 5. [OPTIONAL] Inbox deduplication check │
|
|
396
|
+
│ - InboxRepository.find_or_build(event_id) │
|
|
397
|
+
│ - If already processed → skip and ack │
|
|
398
|
+
│ - If new → InboxRepository.persist_pre() │
|
|
399
|
+
│ - State: "processing" │
|
|
400
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
401
|
+
│
|
|
402
|
+
▼
|
|
403
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
404
|
+
│ 6. MessageProcessor.handle_message() │
|
|
405
|
+
│ - Parse JSON envelope → Event object │
|
|
406
|
+
│ - Run middleware chain │
|
|
407
|
+
│ - Call user handler block │
|
|
408
|
+
│ - Return ActionResult (:ack or :nak) │
|
|
409
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
410
|
+
│
|
|
411
|
+
▼
|
|
412
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
413
|
+
│ 7. Error handling │
|
|
414
|
+
│ - Unrecoverable (ArgumentError, TypeError) → DLQ + ack │
|
|
415
|
+
│ - Recoverable (StandardError) → nak with backoff │
|
|
416
|
+
│ - Malformed JSON → DLQ + ack │
|
|
417
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
418
|
+
│
|
|
419
|
+
▼
|
|
420
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
421
|
+
│ 8. [OPTIONAL] Inbox update │
|
|
422
|
+
│ - Success: InboxRepository.persist_post() │
|
|
423
|
+
│ - Failure: InboxRepository.persist_failure() │
|
|
424
|
+
└──────────────────────┬───────────────────────────────────────┘
|
|
425
|
+
│
|
|
426
|
+
▼
|
|
427
|
+
┌──────────────────────────────────────────────────────────────┐
|
|
428
|
+
│ 9. Acknowledge message │
|
|
429
|
+
│ - :ack → msg.ack (removes from stream) │
|
|
430
|
+
│ - :nak → msg.nak(delay: backoff) (requeue for retry) │
|
|
431
|
+
└──────────────────────────────────────────────────────────────┘
|
|
432
|
+
```
|
|
433
|
+
|
|
434
|
+
---
|
|
435
|
+
|
|
436
|
+
## Reliability Patterns
|
|
437
|
+
|
|
438
|
+
### Outbox Pattern (Publisher Side)
|
|
439
|
+
|
|
440
|
+
**Purpose:** Guarantee at-least-once delivery by persisting events to database before publishing.
|
|
441
|
+
|
|
442
|
+
**Configuration:**
|
|
443
|
+
|
|
444
|
+
```ruby
|
|
445
|
+
config.use_outbox = true
|
|
446
|
+
config.outbox_model = 'JetstreamBridge::OutboxEvent'
|
|
447
|
+
```
|
|
448
|
+
|
|
449
|
+
**States:**
|
|
450
|
+
|
|
451
|
+
- `pending` - Event queued but not yet published
|
|
452
|
+
- `publishing` - Currently being published to NATS
|
|
453
|
+
- `sent` - Successfully published
|
|
454
|
+
- `failed` - Failed after retries
|
|
455
|
+
- `exception` - Unexpected error
|
|
456
|
+
|
|
457
|
+
**Recovery:**
|
|
458
|
+
|
|
459
|
+
```ruby
|
|
460
|
+
# Retry failed events via background job
|
|
461
|
+
JetstreamBridge::OutboxEvent.where(status: 'failed').find_each do |event|
|
|
462
|
+
JetstreamBridge.publish(
|
|
463
|
+
event_id: event.event_id,
|
|
464
|
+
resource_type: event.resource_type,
|
|
465
|
+
event_type: event.event_type,
|
|
466
|
+
payload: event.payload
|
|
467
|
+
)
|
|
468
|
+
end
|
|
469
|
+
```
|
|
470
|
+
|
|
471
|
+
### Inbox Pattern (Consumer Side)
|
|
472
|
+
|
|
473
|
+
**Purpose:** Guarantee exactly-once processing by tracking received events in database.
|
|
474
|
+
|
|
475
|
+
**Configuration:**
|
|
476
|
+
|
|
477
|
+
```ruby
|
|
478
|
+
config.use_inbox = true
|
|
479
|
+
config.inbox_model = 'JetstreamBridge::InboxEvent'
|
|
480
|
+
```
|
|
481
|
+
|
|
482
|
+
**States:**
|
|
483
|
+
|
|
484
|
+
- `received` - Event received but not yet processed
|
|
485
|
+
- `processing` - Currently being processed
|
|
486
|
+
- `processed` - Successfully processed
|
|
487
|
+
- `failed` - Failed processing
|
|
488
|
+
|
|
489
|
+
**Deduplication:**
|
|
490
|
+
|
|
491
|
+
- Uses `event_id` for primary deduplication
|
|
492
|
+
- Falls back to `stream_seq` if event_id not available
|
|
493
|
+
- Database row locking prevents concurrent processing
|
|
494
|
+
|
|
495
|
+
**Example:**
|
|
496
|
+
|
|
497
|
+
```ruby
|
|
498
|
+
# First delivery
|
|
499
|
+
inbox = InboxRepository.find_or_build(event_id: "abc123")
|
|
500
|
+
inbox.new_record? # => true
|
|
501
|
+
inbox.processed_at # => nil
|
|
502
|
+
# Process message...
|
|
503
|
+
|
|
504
|
+
# Second delivery (redelivery)
|
|
505
|
+
inbox = InboxRepository.find_or_build(event_id: "abc123")
|
|
506
|
+
inbox.new_record? # => false
|
|
507
|
+
inbox.processed_at # => 2024-01-01 00:00:00
|
|
508
|
+
# Skip processing, already done
|
|
509
|
+
```
|
|
510
|
+
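The deduplication behavior can be illustrated without a database. The sketch below uses an in-memory hash and a `Mutex` as a stand-in for the inbox table and row locking. `MemoryInbox` and `process_once` are hypothetical names, and the state strings are taken from the list above.

```ruby
# In-memory stand-in for the inbox table; hypothetical, for illustration only.
class MemoryInbox
  Record = Struct.new(:event_id, :status, :processed_at)

  def initialize
    @records = {}
    @lock = Mutex.new # stands in for database row locking
  end

  # Runs the block at most once per event_id.
  # Returns :processed on first success and :skipped on redelivery.
  def process_once(event_id)
    @lock.synchronize do
      record = (@records[event_id] ||= Record.new(event_id, "received", nil))
      return :skipped if record.status == "processed"

      record.status = "processing"
      begin
        yield
      rescue StandardError
        record.status = "failed" # a later redelivery may retry
        raise
      end
      record.status = "processed"
      record.processed_at = Time.now
      :processed
    end
  end
end
```

Holding one lock for every event is a simplification; a real inbox would lock per row so unrelated events can be processed concurrently.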
|
|
511
|
+
### Dead Letter Queue (DLQ)
|
|
512
|
+
|
|
513
|
+
**Purpose:** Route unrecoverable messages to separate subject for manual intervention.
|
|
514
|
+
|
|
515
|
+
**Configuration:**
|
|
516
|
+
|
|
517
|
+
```ruby
|
|
518
|
+
config.use_dlq = true
|
|
519
|
+
```
|
|
520
|
+
|
|
521
|
+
**Triggered by:**
|
|
522
|
+
|
|
523
|
+
1. **Malformed JSON** - Cannot parse event envelope
|
|
524
|
+
2. **Max deliveries exceeded** - Message failed `config.max_deliver` times
|
|
525
|
+
3. **Unrecoverable errors** - ArgumentError, TypeError, NameError
|
|
526
|
+
|
|
527
|
+
**DLQ Message Headers:**
|
|
528
|
+
|
|
529
|
+
```json
|
|
530
|
+
{
|
|
531
|
+
"x-dead-letter": "true",
|
|
532
|
+
"x-dlq-reason": "max_deliveries_exceeded",
|
|
533
|
+
"x-deliveries": "5",
|
|
534
|
+
"x-dlq-context": {
|
|
535
|
+
"event_id": "abc123",
|
|
536
|
+
"error_class": "StandardError",
|
|
537
|
+
"error_message": "Something went wrong",
|
|
538
|
+
"original_subject": "worker.sync.api",
|
|
539
|
+
"stream_sequence": 42,
|
|
540
|
+
"consumer_sequence": 10,
|
|
541
|
+
"timestamp": "2024-01-01T00:00:00Z"
|
|
542
|
+
}
|
|
543
|
+
}
|
|
544
|
+
```
|
|
545
|
+
|
|
546
|
+
**DLQ Subject:**
|
|
547
|
+
|
|
548
|
+
```text
|
|
549
|
+
{app_name}.sync.dlq
|
|
550
|
+
```
|
|
551
|
+
|
|
552
|
+
**Monitoring DLQ:**
|
|
553
|
+
|
|
554
|
+
```bash
|
|
555
|
+
# View DLQ messages
|
|
556
|
+
nats sub 'api.sync.dlq'
|
|
557
|
+
|
|
558
|
+
# Check DLQ consumer
|
|
559
|
+
nats consumer info jetstream-bridge-stream api-dlq-consumer
|
|
560
|
+
```
|
|
561
|
+
|
|
562
|
+
### Retry Strategy
|
|
563
|
+
|
|
564
|
+
**Configuration:**
|
|
565
|
+
|
|
566
|
+
```ruby
|
|
567
|
+
config.max_deliver = 5 # Max delivery attempts (first delivery included)
|
|
568
|
+
config.ack_wait = '30s' # Time before JetStream redelivers
|
|
569
|
+
config.backoff = ['1s', '5s', '15s', '30s', '60s']
|
|
570
|
+
```
|
|
571
|
+
|
|
572
|
+
**Backoff Calculation:**
|
|
573
|
+
|
|
574
|
+
- Base delay: 0.5s (transient errors) or 2.0s (other errors)
|
|
575
|
+
- Exponential multiplier: 2^(attempt - 1)
|
|
576
|
+
- Min delay: 1 second
|
|
577
|
+
- Max delay: 60 seconds
|
|
578
|
+
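The calculation listed above amounts to an exponential delay clamped to a range. This is a sketch of that arithmetic only: `backoff_delay` and the constant names are hypothetical, and how the gem combines this with `config.backoff` is not shown here.

```ruby
# Hypothetical helper for illustration; mirrors the documented formula only.
MIN_DELAY_SECONDS = 1.0
MAX_DELAY_SECONDS = 60.0

# attempt is 1-based; transient selects the 0.5s base instead of 2.0s.
def backoff_delay(attempt, transient: false)
  base = transient ? 0.5 : 2.0
  (base * (2**(attempt - 1))).clamp(MIN_DELAY_SECONDS, MAX_DELAY_SECONDS)
end
```

A first transient failure computes 0.5s and is raised to the 1s minimum; a tenth attempt is capped at 60s.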
|
|
579
|
+
**Example Timeline:**
|
|
580
|
+
|
|
581
|
+
```text
|
|
582
|
+
Attempt 1 → Fail → NAK with delay 1s
|
|
583
|
+
Attempt 2 (1s later) → Fail → NAK with delay 5s
|
|
584
|
+
Attempt 3 (5s later) → Fail → NAK with delay 15s
|
|
585
|
+
Attempt 4 (15s later) → Fail → NAK with delay 30s
|
|
586
|
+
Attempt 5 (30s later) → Fail → DLQ + ACK (max_deliver reached)
|
|
588
|
+
```
|
|
589
|
+
|
|
590
|
+
---
|
|
591
|
+
|
|
592
|
+
## Consumer Modes
|
|
593
|
+
|
|
594
|
+
### Pull Mode (Default)
|
|
595
|
+
|
|
596
|
+
**Configuration:**
|
|
597
|
+
|
|
598
|
+
```ruby
|
|
599
|
+
config.consumer_mode = :pull # default
|
|
600
|
+
```
|
|
601
|
+
|
|
602
|
+
**How it works:**
|
|
603
|
+
|
|
604
|
+
1. Consumer publishes request to `$JS.API.CONSUMER.MSG.NEXT.{stream}.{durable}`
|
|
605
|
+
2. JetStream responds with batch of messages (up to `batch_size`)
|
|
606
|
+
3. Consumer processes messages and requests next batch
|
|
607
|
+
|
|
608
|
+
**Advantages:**
|
|
609
|
+
|
|
610
|
+
- **Backpressure control** - Consumer pulls when ready
|
|
611
|
+
- **Restricted permissions** - No stream or consumer management API access needed at runtime; only the `$JS.API.CONSUMER.MSG.NEXT` request subject is used
|
|
612
|
+
- **Scalability** - Multiple workers pull at their own pace
|
|
613
|
+
|
|
614
|
+
**Message Fetch:**
|
|
615
|
+
|
|
616
|
+
```ruby
|
|
617
|
+
# Pull request
|
|
618
|
+
{
|
|
619
|
+
"batch": 10,
|
|
620
|
+
"max_bytes": 1048576, # 1MB
|
|
621
|
+
"idle_heartbeat": 5000000000 # 5s
|
|
622
|
+
}
|
|
623
|
+
```
|
|
624
|
+
|
|
625
|
+
**Use cases:**
|
|
626
|
+
|
|
627
|
+
- High-throughput processing
|
|
628
|
+
- Variable processing time per message
|
|
629
|
+
- Restricted production environments
|
|
630
|
+
- Multiple consumer instances
|
|
631
|
+
|
|
632
|
+
### Push Mode
|
|
633
|
+
|
|
634
|
+
**Configuration:**
|
|
635
|
+
|
|
636
|
+
```ruby
|
|
637
|
+
config.consumer_mode = :push
|
|
638
|
+
config.delivery_subject = 'worker.sync.api.worker' # optional
|
|
639
|
+
```
|
|
640
|
+
|
|
641
|
+
**How it works:**
|
|
642
|
+
|
|
643
|
+
1. JetStream automatically pushes messages to delivery subject
|
|
644
|
+
2. Consumer subscribes to delivery subject
|
|
645
|
+
3. Messages arrive as soon as available
|
|
646
|
+
|
|
647
|
+
**Advantages:**
|
|
648
|
+
|
|
649
|
+
- **Lower latency** - No request/response roundtrip
|
|
650
|
+
- **Simpler model** - Fire-and-forget from JetStream side
|
|
651
|
+
- **Good for real-time** - Immediate delivery
|
|
652
|
+
|
|
653
|
+
**Default Delivery Subject:**
|
|
654
|
+
|
|
655
|
+
```text
|
|
656
|
+
{destination_subject}.worker
|
|
657
|
+
```
|
|
658
|
+
|
|
659
|
+
Example: `worker.sync.api.worker`
|
|
660
|
+
|
|
661
|
+
**Use cases:**
|
|
662
|
+
|
|
663
|
+
- Low-latency requirements
|
|
664
|
+
- Event-driven architectures
|
|
665
|
+
- Moderate message volume
|
|
666
|
+
- Single consumer instance
|
|
667
|
+
|
|
668
|
+
### Comparison Table
|
|
669
|
+
|
|
670
|
+
| Feature | Pull Mode | Push Mode |
|
|
671
|
+
|---------|-----------|-----------|
|
|
672
|
+
| **Control** | Consumer-driven | Server-driven |
|
|
673
|
+
| **Latency** | Slightly higher (request/response) | Lower (immediate push) |
|
|
674
|
+
| **Backpressure** | Natural (consumer controls fetch) | Requires management |
|
|
675
|
+
| **Permissions** | Works with restricted permissions | Standard permissions |
|
|
676
|
+
| **Scalability** | Better for high throughput | Good for moderate load |
|
|
677
|
+
| **Complexity** | More API calls | Simpler |
|
|
678
|
+
| **Best for** | Batch processing, high volume | Real-time, low volume |
|
|
679
|
+
|
|
680
|
+
---
|
|
681
|
+
|
|
682
|
+
## Configuration & Lifecycle
|
|
683
|
+
|
|
684
|
+
### Configuration Flow
|
|
685
|
+
|
|
686
|
+
```ruby
|
|
687
|
+
# config/initializers/jetstream_bridge.rb
|
|
688
|
+
JetstreamBridge.configure do |config|
|
|
689
|
+
# Connection
|
|
690
|
+
config.nats_urls = ENV.fetch('NATS_URLS', 'nats://localhost:4222')
|
|
691
|
+
config.stream_name = 'jetstream-bridge-stream'
|
|
692
|
+
|
|
693
|
+
# Application identity (no environment suffix!)
|
|
694
|
+
config.app_name = 'api'
|
|
695
|
+
config.destination_app = 'worker'
|
|
696
|
+
|
|
697
|
+
# Reliability
|
|
698
|
+
config.use_outbox = true
|
|
699
|
+
config.use_inbox = true
|
|
700
|
+
config.use_dlq = true
|
|
701
|
+
|
|
702
|
+
# Consumer tuning
|
|
703
|
+
config.max_deliver = 5
|
|
704
|
+
config.ack_wait = '30s'
|
|
705
|
+
config.backoff = %w[1s 5s 15s 30s 60s]
|
|
706
|
+
|
|
707
|
+
# Consumer mode
|
|
708
|
+
config.consumer_mode = :pull # or :push
|
|
709
|
+
|
|
710
|
+
# Provisioning
|
|
711
|
+
config.auto_provision = true # false for restricted environments
|
|
712
|
+
|
|
713
|
+
# Connection management
|
|
714
|
+
config.connect_retry_attempts = 3
|
|
715
|
+
config.connect_retry_delay = 2
|
|
716
|
+
config.lazy_connect = false
|
|
717
|
+
end
|
|
718
|
+
```
|
|
719
|
+
|
|
720
|
+
### Startup Lifecycle
|
|
721
|
+
|
|
722
|
+
**Automatic (Rails):**
|
|
723
|
+
|
|
724
|
+
```ruby
|
|
725
|
+
# Railtie automatically calls after initialization:
|
|
726
|
+
JetstreamBridge.startup!
|
|
727
|
+
```
|
|
728
|
+
|
|
729
|
+
**Manual:**
|
|
730
|
+
|
|
731
|
+
```ruby
|
|
732
|
+
# Non-Rails or custom boot
|
|
733
|
+
JetstreamBridge.startup!
|
|
734
|
+
```
|
|
735
|
+
|
|
736
|
+
**Lazy Connect:**
|
|
737
|
+
|
|
738
|
+
```ruby
|
|
739
|
+
config.lazy_connect = true
|
|
740
|
+
# OR
|
|
741
|
+
ENV['JETSTREAM_BRIDGE_DISABLE_AUTOSTART'] = '1'
|
|
742
|
+
|
|
743
|
+
# Connection happens on first publish/subscribe
|
|
744
|
+
```
|
|
745
|
+
|
|
746
|
+
**Startup Steps:**
|
|
747
|
+
|
|
748
|
+
1. Validate configuration
|
|
749
|
+
2. Connect to NATS
|
|
750
|
+
3. Verify JetStream availability
|
|
751
|
+
4. Ensure stream topology (if `auto_provision=true`)
|
|
752
|
+
5. Cache connection for reuse
|
|
753
|
+
|
|
754
|
+
### Provisioning Modes
|
|
755
|
+
|
|
756
|
+
#### Auto Provisioning (Default)
|
|
757
|
+
|
|
758
|
+
**Configuration:**
|
|
759
|
+
|
|
760
|
+
```ruby
|
|
761
|
+
config.auto_provision = true
|
|
762
|
+
```
|
|
763
|
+
|
|
764
|
+
**Behavior:**
|
|
765
|
+
|
|
766
|
+
- Creates streams and consumers at runtime
|
|
767
|
+
- Requires JetStream API permissions
|
|
768
|
+
- Idempotent (safe to re-run)
|
|
769
|
+
|
|
770
|
+
#### Manual Provisioning
|
|
771
|
+
|
|
772
|
+
**Configuration:**
|
|
773
|
+
|
|
774
|
+
```ruby
|
|
775
|
+
config.auto_provision = false
|
|
776
|
+
```
|
|
777
|
+
|
|
778
|
+
**Provisioning at deploy time:**
|
|
779
|
+
|
|
780
|
+
```bash
|
|
781
|
+
# Using rake task with admin credentials
|
|
782
|
+
NATS_URLS=nats://admin:pass@host:4222 \
|
|
783
|
+
bundle exec rake jetstream_bridge:provision
|
|
784
|
+
```
|
|
785
|
+
|
|
786
|
+
**Benefits:**
|
|
787
|
+
|
|
788
|
+
- Runtime credentials don't need admin permissions
|
|
789
|
+
- Separate provisioning from application lifecycle
|
|
790
|
+
- Better security posture
|
|
791
|
+
|
|
792
|
+
See [RESTRICTED_PERMISSIONS.md](RESTRICTED_PERMISSIONS.md) for details.
|
|
793
|
+
|
|
794
|
+
### Reconnection Handling
|
|
795
|
+
|
|
796
|
+
**Automatic reconnection:**
|
|
797
|
+
|
|
798
|
+
```ruby
|
|
799
|
+
# NATS client auto-reconnects on network failures
|
|
800
|
+
# JetstreamBridge preserves JetStream context
|
|
801
|
+
```
|
|
802
|
+
|
|
803
|
+
**Manual reconnection:**
|
|
804
|
+
|
|
805
|
+
```ruby
|
|
806
|
+
# After forking (Puma, Sidekiq)
|
|
807
|
+
JetstreamBridge.reconnect!
|
|
808
|
+
|
|
809
|
+
# Example: Puma config
|
|
810
|
+
on_worker_boot do
|
|
811
|
+
JetstreamBridge.reconnect!
|
|
812
|
+
end
|
|
813
|
+
```
|
|
814
|
+
|
|
815
|
+
---
|
|
816
|
+
|
|
817
|
+
## Error Handling
|
|
818
|
+
|
|
819
|
+
### Error Categories
|
|
820
|
+
|
|
821
|
+
#### Unrecoverable Errors
|
|
822
|
+
|
|
823
|
+
**Types:**
|
|
824
|
+
|
|
825
|
+
- `ArgumentError` - Invalid arguments
|
|
826
|
+
- `TypeError` - Type mismatch
|
|
827
|
+
- `NameError` - Undefined constant/method
|
|
828
|
+
|
|
829
|
+
**Handling:**
|
|
830
|
+
|
|
831
|
+
1. Log error with full context
|
|
832
|
+
2. Publish to DLQ with `x-dlq-reason: unrecoverable_error`
|
|
833
|
+
3. ACK message (remove from stream)
|
|
834
|
+
|
|
835
|
+
#### Recoverable Errors
|
|
836
|
+
|
|
837
|
+
**Types:**
|
|
838
|
+
|
|
839
|
+
- `StandardError` (default)
|
|
840
|
+
- Transient failures (network, timeouts)
|
|
841
|
+
- Retryable business logic errors
|
|
842
|
+
|
|
843
|
+
**Handling:**
|
|
844
|
+
|
|
845
|
+
1. Log error with delivery count
|
|
846
|
+
2. NAK message with backoff delay
|
|
847
|
+
3. JetStream redelivers after delay
|
|
848
|
+
4. After `max_deliver` attempts → DLQ
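
The recoverable-error steps above can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the gem's implementation: `retry_decision`, `base_delay`, `max_delay`, and the doubling formula are invented for the example; only `max_deliver` and the NAK-then-DLQ ordering come from the text.

```ruby
# Hypothetical helper: decide what to do with a message whose handler raised
# a recoverable error. Not part of JetstreamBridge's public API.
def retry_decision(delivery_count, max_deliver:, base_delay: 1.0, max_delay: 60.0)
  # Once the delivery budget is spent, stop retrying and route to the DLQ.
  return { action: :dlq } if delivery_count >= max_deliver

  # Assumed backoff: double the delay on each delivery, capped at max_delay.
  delay = [base_delay * (2**(delivery_count - 1)), max_delay].min
  { action: :nak, delay: delay }
end

first_failure = retry_decision(1, max_deliver: 5) # NAK with the base delay
budget_spent  = retry_decision(5, max_deliver: 5) # routed to the DLQ
```

Capping the delay keeps a long retry budget from pushing redelivery arbitrarily far into the future.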
|
|
849
|
+
|
|
850
|
+
#### Malformed Messages
|
|
851
|
+
|
|
852
|
+
**Types:**
|
|
853
|
+
|
|
854
|
+
- JSON parse errors
|
|
855
|
+
- Invalid envelope structure
|
|
856
|
+
|
|
857
|
+
**Handling:**
|
|
858
|
+
|
|
859
|
+
1. Log raw message data
|
|
860
|
+
2. Publish to DLQ with `x-dlq-reason: malformed_json`
|
|
861
|
+
3. ACK message (remove from stream)
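
The three categories above can be expressed as one classifier. The sketch below is illustrative only: `classify_error` and the returned hash shape are assumptions, while the error classes and `x-dlq-reason` values are taken from the lists above. Note that `ArgumentError`, `TypeError`, `NameError`, and `JSON::ParserError` all inherit from `StandardError`, so the specific checks have to come before the generic one.

```ruby
require "json"

# Error classes the text lists as unrecoverable.
UNRECOVERABLE = [ArgumentError, TypeError, NameError].freeze

# Hypothetical classifier: maps an exception to the handling described above.
def classify_error(error)
  case error
  when JSON::ParserError
    # Malformed payload: retrying cannot help, so ACK and send to the DLQ.
    { category: :malformed, action: :ack, dlq_reason: "malformed_json" }
  when *UNRECOVERABLE
    # Programming errors (NoMethodError is a NameError subclass).
    { category: :unrecoverable, action: :ack, dlq_reason: "unrecoverable_error" }
  when StandardError
    # Everything else is treated as transient and retried via NAK.
    { category: :recoverable, action: :nak, dlq_reason: nil }
  else
    # Non-StandardError exceptions (e.g. Interrupt) are not handled here.
    raise error
  end
end
```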
|
|
862
|
+
|
|
863
|
+
### Error Context
|
|
864
|
+
|
|
865
|
+
**Logged Information:**
|
|
866
|
+
|
|
867
|
+
```ruby
|
|
868
|
+
{
|
|
869
|
+
error_class: "StandardError",
|
|
870
|
+
error_message: "Database connection lost",
|
|
871
|
+
event_id: "abc123",
|
|
872
|
+
resource_type: "user",
|
|
873
|
+
event_type: "user.created",
|
|
874
|
+
delivery_count: 3,
|
|
875
|
+
stream_sequence: 42,
|
|
876
|
+
consumer_sequence: 10,
|
|
877
|
+
subject: "worker.sync.api",
|
|
878
|
+
backtrace: [...]
|
|
879
|
+
}
|
|
880
|
+
```
|
|
881
|
+
|
|
882
|
+
### Custom Error Handling
|
|
883
|
+
|
|
884
|
+
**Middleware approach:**
|
|
885
|
+
|
|
886
|
+
```ruby
|
|
887
|
+
class CustomErrorHandler
|
|
888
|
+
def call(event, next_middleware)
|
|
889
|
+
next_middleware.call(event)
|
|
890
|
+
rescue CustomRetryableError => e
|
|
891
|
+
# Return ActionResult with custom delay
|
|
892
|
+
JetstreamBridge::Consumer::ActionResult.new(:nak, delay: 10)
|
|
893
|
+
rescue CustomPermanentError => e
|
|
894
|
+
# Log and move to DLQ
|
|
895
|
+
logger.error("Permanent error: #{e.message}")
|
|
896
|
+
publish_to_custom_dlq(event, e)
|
|
897
|
+
JetstreamBridge::Consumer::ActionResult.new(:ack)
|
|
898
|
+
end
|
|
899
|
+
end
|
|
900
|
+
|
|
901
|
+
consumer.use(CustomErrorHandler.new)
|
|
902
|
+
```
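
To make the `call(event, next_middleware)` contract concrete, here is one way such a chain could be composed. `MiddlewareChain` and `RescueToNak` are hypothetical stand-ins, and plain symbols replace `ActionResult` to keep the sketch self-contained; the gem's real internals may differ.

```ruby
# Hypothetical stand-in for the consumer's middleware stack.
class MiddlewareChain
  def initialize
    @middlewares = []
  end

  def use(middleware)
    @middlewares << middleware
    self
  end

  # Wraps the handler so the first registered middleware runs outermost.
  def call(event, &handler)
    chain = @middlewares.reverse.reduce(handler) do |next_middleware, middleware|
      ->(evt) { middleware.call(evt, next_middleware) }
    end
    chain.call(event)
  end
end

# Example middleware: converts an IOError from the handler into a NAK.
class RescueToNak
  def call(event, next_middleware)
    next_middleware.call(event)
  rescue IOError
    :nak
  end
end

chain = MiddlewareChain.new.use(RescueToNak.new)
failed = chain.call({ id: 1 }) { |_event| raise IOError, "socket closed" } # => :nak
passed = chain.call({ id: 2 }) { |_event| :ack }                           # => :ack
```

Because each middleware receives the rest of the chain as a callable, it can rescue errors, time the call, or short-circuit without the handler knowing about it.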
|
|
903
|
+
|
|
904
|
+
---
|
|
905
|
+
|
|
906
|
+
## Thread Safety
|
|
907
|
+
|
|
908
|
+
### Connection Singleton
|
|
909
|
+
|
|
910
|
+
**Thread-safe initialization:**
|
|
911
|
+
|
|
912
|
+
```ruby
|
|
913
|
+
@@connection_lock = Mutex.new
|
|
914
|
+
|
|
915
|
+
def self.instance
|
|
916
|
+
return @@connection if @@connection
|
|
917
|
+
|
|
918
|
+
@@connection_lock.synchronize do
|
|
919
|
+
@@connection ||= new
|
|
920
|
+
end
|
|
921
|
+
|
|
922
|
+
@@connection
|
|
923
|
+
end
|
|
924
|
+
```
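
A runnable variant of the same double-checked pattern is sketched below. It uses class-level instance variables initialized to `nil` instead of `@@` class variables, because reading a class variable that was never assigned raises `NameError`. `DemoConnection` is a hypothetical class for illustration, not the gem's connection object.

```ruby
# Hypothetical class illustrating mutex-guarded lazy initialization.
class DemoConnection
  @instance = nil
  @lock = Mutex.new

  class << self
    def instance
      # Fast path: skip the lock once the singleton exists.
      return @instance if @instance

      # Slow path: one thread builds the instance; ||= re-checks inside the
      # lock in case another thread won the race first.
      @lock.synchronize { @instance ||= new }
    end
  end
end

# Eight threads race to obtain the singleton; all receive the same object.
instances = Array.new(8) { Thread.new { DemoConnection.instance } }.map(&:value)
```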
|
|
925
|
+
|
|
926
|
+
**Health check cache:**
|
|
927
|
+
|
|
928
|
+
```ruby
|
|
929
|
+
# Thread-safe cache updates
|
|
930
|
+
@health_cache_lock.synchronize do
|
|
931
|
+
@health_cache = { data: health_data, cached_at: Time.now }
|
|
932
|
+
end
|
|
933
|
+
```
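
The cache update above pairs naturally with a freshness check on read. The sketch below shows one possible shape; `HealthCache`, the 30-second TTL, and the injectable `clock` are assumptions made for the example, and only `skip_cache` and the `data`/`cached_at` fields echo the original.

```ruby
# Hypothetical TTL cache guarded by a mutex.
class HealthCache
  def initialize(ttl: 30, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @ttl = ttl
    @clock = clock
    @lock = Mutex.new
    @entry = nil
  end

  # Returns cached data while it is fresh; otherwise runs the block,
  # stores the result with its timestamp, and returns it.
  def fetch(skip_cache: false)
    @lock.synchronize do
      now = @clock.call
      fresh = @entry && (now - @entry[:cached_at]) < @ttl
      return @entry[:data] if fresh && !skip_cache

      data = yield
      @entry = { data: data, cached_at: now }
      data
    end
  end
end
```

Holding the lock while the block runs means concurrent callers wait for a single health probe rather than each issuing their own, at the cost of serializing reads during a refresh.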
|
|
934
|
+
|
|
935
|
+
### Consumer Processing
|
|
936
|
+
|
|
937
|
+
**Single-threaded by design:**
|
|
938
|
+
|
|
939
|
+
- Fetch batch → Process sequentially → Fetch next batch
|
|
940
|
+
- No concurrent message processing within one consumer instance
|
|
941
|
+
- Multiple consumer instances for parallelism
|
|
942
|
+
|
|
943
|
+
**Inbox Row Locking:**
|
|
944
|
+
|
|
945
|
+
```ruby
|
|
946
|
+
# Prevents concurrent processing of same event_id
|
|
947
|
+
InboxEvent.lock.find_or_create_by!(event_id: event.event_id) do |inbox|
|
|
948
|
+
inbox.status = 'processing'
|
|
949
|
+
end
|
|
950
|
+
```
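
The idea behind the row lock, that only one worker may claim a given `event_id`, can be shown without a database. The class below is an in-memory stand-in written for illustration; `InMemoryInbox` and `claim` are hypothetical names, and the real inbox relies on ActiveRecord and the database rather than a `Set`.

```ruby
require "set"

# Hypothetical in-memory inbox: the first caller to claim an event_id wins.
class InMemoryInbox
  def initialize
    @lock = Mutex.new
    @claimed = Set.new
  end

  # Returns true only for the first claim of this event_id.
  # Set#add? returns nil when the element is already present.
  def claim(event_id)
    @lock.synchronize { !@claimed.add?(event_id).nil? }
  end
end

inbox = InMemoryInbox.new
# Ten threads race for the same event; exactly one claim succeeds.
winners = Array.new(10) { Thread.new { inbox.claim("abc123") } }.map(&:value)
```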
|
|
951
|
+
|
|
952
|
+
### Publisher
|
|
953
|
+
|
|
954
|
+
**Thread-safe publishing:**
|
|
955
|
+
|
|
956
|
+
- No global state mutation
|
|
957
|
+
- Independent envelope generation per call
|
|
958
|
+
- Outbox uses AR transactions for atomicity
|
|
959
|
+
|
|
960
|
+
**Concurrent publishing:**
|
|
961
|
+
|
|
962
|
+
```ruby
|
|
963
|
+
# Safe to call from multiple threads
|
|
964
|
+
threads = 10.times.map do |i|
|
|
965
|
+
Thread.new do
|
|
966
|
+
JetstreamBridge.publish(
|
|
967
|
+
resource_type: 'user',
|
|
968
|
+
event_type: 'user.created',
|
|
969
|
+
payload: { id: i }
|
|
970
|
+
)
|
|
971
|
+
end
|
|
972
|
+
end
|
|
973
|
+
|
|
974
|
+
threads.each(&:join)
|
|
975
|
+
```
|
|
976
|
+
|
|
977
|
+
### Best Practices
|
|
978
|
+
|
|
979
|
+
1. **One consumer per process** - Avoid multiple consumer loops in one process
|
|
980
|
+
2. **Fork safety** - Call `JetstreamBridge.reconnect!` after forking
|
|
981
|
+
3. **Database connections** - ActiveRecord handles connection pooling
|
|
982
|
+
4. **Signal handling** - Consumer handles INT/TERM for graceful shutdown
|
|
983
|
+
|
|
984
|
+
---
|
|
985
|
+
|
|
986
|
+
## Performance Considerations
|
|
987
|
+
|
|
988
|
+
### Batch Size
|
|
989
|
+
|
|
990
|
+
**Pull mode:**
|
|
991
|
+
|
|
992
|
+
```ruby
|
|
993
|
+
consumer = JetstreamBridge::Consumer.new(batch_size: 10) do |event|
|
|
994
|
+
# Process event
|
|
995
|
+
end
|
|
996
|
+
```
|
|
997
|
+
|
|
998
|
+
**Trade-offs:**
|
|
999
|
+
|
|
1000
|
+
- **Small batch (1-5):** Lower latency, more API calls
|
|
1001
|
+
- **Medium batch (10-50):** Balanced latency and throughput
|
|
1002
|
+
- **Large batch (50+):** Higher throughput, risk of processing timeouts
|
|
1003
|
+
|
|
1004
|
+
### Idle Backoff
|
|
1005
|
+
|
|
1006
|
+
**Exponential backoff when no messages:**
|
|
1007
|
+
|
|
1008
|
+
```text
|
|
1009
|
+
0.05s → 0.1s → 0.2s → 0.4s → 0.8s → 1.0s (max)
|
|
1010
|
+
```
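
The schedule above follows from doubling a 0.05s delay and capping it at 1.0s. A minimal sketch of that rule is shown below; `IdleBackoff` and its method names are hypothetical, and the assumption that the delay resets once messages arrive is mine rather than something the text states.

```ruby
# Hypothetical helper reproducing the idle schedule: start at 0.05s,
# double after each empty fetch, cap at 1.0s.
class IdleBackoff
  MIN_DELAY = 0.05
  MAX_DELAY = 1.0

  def initialize
    reset
  end

  # Assumed behaviour: call this when a fetch returns messages.
  def reset
    @delay = MIN_DELAY
  end

  # Returns the delay to sleep now, then doubles it for the next idle round.
  def next_delay
    current = @delay
    @delay = [@delay * 2, MAX_DELAY].min
    current
  end
end

backoff = IdleBackoff.new
schedule = Array.new(7) { backoff.next_delay }
# schedule == [0.05, 0.1, 0.2, 0.4, 0.8, 1.0, 1.0]
```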
|
|
1011
|
+
|
|
1012
|
+
**Benefit:** Reduces CPU and network usage during idle periods
|
|
1013
|
+
|
|
1014
|
+
### Connection Pooling
|
|
1015
|
+
|
|
1016
|
+
**Single connection per process:**
|
|
1017
|
+
|
|
1018
|
+
- NATS client multiplexes all publishes and subscriptions over one connection
|
|
1019
|
+
- JetStream context cached for reuse
|
|
1020
|
+
- No need for application-level pooling
|
|
1021
|
+
|
|
1022
|
+
### Memory Management
|
|
1023
|
+
|
|
1024
|
+
**Long-running consumers:**
|
|
1025
|
+
|
|
1026
|
+
- Periodic health checks every 10 minutes
|
|
1027
|
+
- Memory monitoring can be added via middleware
|
|
1028
|
+
- Graceful shutdown prevents memory leaks
|
|
1029
|
+
|
|
1030
|
+
---
|
|
1031
|
+
|
|
1032
|
+
## Observability
|
|
1033
|
+
|
|
1034
|
+
### Health Checks
|
|
1035
|
+
|
|
1036
|
+
```ruby
|
|
1037
|
+
health = JetstreamBridge.health_check(skip_cache: false)
# `health` is a Hash shaped like the following (example values):
|
|
1038
|
+
|
|
1039
|
+
{
|
|
1040
|
+
healthy: true,
|
|
1041
|
+
connection: {
|
|
1042
|
+
status: "connected",
|
|
1043
|
+
servers: ["nats://localhost:4222"],
|
|
1044
|
+
connected_at: "2024-01-01T00:00:00Z"
|
|
1045
|
+
},
|
|
1046
|
+
jetstream: {
|
|
1047
|
+
streams: 1,
|
|
1048
|
+
consumers: 2,
|
|
1049
|
+
memory_bytes: 104857600,
|
|
1050
|
+
storage_bytes: 1073741824
|
|
1051
|
+
},
|
|
1052
|
+
config: {
|
|
1053
|
+
stream_name: "jetstream-bridge-stream",
|
|
1054
|
+
app_name: "api",
|
|
1055
|
+
destination_app: "worker",
|
|
1056
|
+
use_outbox: true,
|
|
1057
|
+
use_inbox: true,
|
|
1058
|
+
use_dlq: true
|
|
1059
|
+
},
|
|
1060
|
+
performance: {
|
|
1061
|
+
message_processing_time_ms: 45.2,
|
|
1062
|
+
last_health_check_ms: 12.5
|
|
1063
|
+
}
|
|
1064
|
+
}
|
|
1065
|
+
```
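
One way to expose this result to a load balancer or orchestrator is a small Rack-compatible endpoint. The sketch below assumes the result is a Hash with a `:healthy` key, as in the example above; `HealthEndpoint` and `health_source` are hypothetical names, and the source is injected as a callable (for instance `-> { JetstreamBridge.health_check }`) so the class does not depend on the gem.

```ruby
require "json"

# Hypothetical Rack-compatible health endpoint.
class HealthEndpoint
  JSON_HEADERS = { "content-type" => "application/json" }.freeze

  def initialize(health_source)
    @health_source = health_source # callable returning a Hash with :healthy
  end

  def call(_env)
    health = @health_source.call
    # 200 when healthy, 503 otherwise, so probes can rely on the status code.
    status = health[:healthy] ? 200 : 503
    [status, JSON_HEADERS, [JSON.generate(health)]]
  rescue StandardError => e
    # A failing health probe is itself reported as unhealthy.
    [503, JSON_HEADERS, [JSON.generate({ healthy: false, error: e.class.name })]]
  end
end
```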
|
|
1066
|
+
|
|
1067
|
+
### Logging
|
|
1068
|
+
|
|
1069
|
+
**Structured logging:**
|
|
1070
|
+
|
|
1071
|
+
```text
|
|
1072
|
+
# Publisher
|
|
1073
|
+
INFO [JetstreamBridge::Publisher] Published api.sync.worker event_id=abc123
|
|
1074
|
+
DEBUG [JetstreamBridge::Publisher] Envelope: {...}
|
|
1075
|
+
|
|
1076
|
+
# Consumer
|
|
1077
|
+
INFO [JetstreamBridge::Consumer] Processing message event_id=abc123
|
|
1078
|
+
WARN [JetstreamBridge::Consumer] Retry 3/5 for event_id=abc123
|
|
1079
|
+
ERROR [JetstreamBridge::Consumer] Unrecoverable error: ArgumentError
|
|
1080
|
+
```
|
|
1081
|
+
|
|
1082
|
+
### Metrics Points
|
|
1083
|
+
|
|
1084
|
+
**Consider tracking:**
|
|
1085
|
+
|
|
1086
|
+
- Message publish rate and latency
|
|
1087
|
+
- Message processing rate and latency
|
|
1088
|
+
- Error rates by type
|
|
1089
|
+
- DLQ message count
|
|
1090
|
+
- Inbox/outbox table sizes
|
|
1091
|
+
- Consumer lag (JetStream consumer info)
|
|
1092
|
+
|
|
1093
|
+
**Example with middleware:**
|
|
1094
|
+
|
|
1095
|
+
```ruby
|
|
1096
|
+
class MetricsMiddleware
|
|
1097
|
+
def call(event, next_middleware)
|
|
1098
|
+
start = Time.now
|
|
1099
|
+
result = next_middleware.call(event)
|
|
1100
|
+
duration = Time.now - start
|
|
1101
|
+
|
|
1102
|
+
StatsD.increment('jetstream.messages.processed')
|
|
1103
|
+
StatsD.histogram('jetstream.processing_time', duration)
|
|
1104
|
+
|
|
1105
|
+
result
|
|
1106
|
+
rescue => e
|
|
1107
|
+
StatsD.increment('jetstream.messages.failed', tags: ["error:#{e.class}"])
|
|
1108
|
+
raise
|
|
1109
|
+
end
|
|
1110
|
+
end
|
|
1111
|
+
```
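
A variation on the middleware above measures duration with the monotonic clock, which is unaffected by wall-clock adjustments, and records timing for failures as well as successes. This is a sketch under assumptions: `TimedMetricsMiddleware` and the `recorder` interface (`increment(name)` and `histogram(name, value)`) are hypothetical, standing in for whatever metrics client is in use.

```ruby
# Hypothetical metrics middleware with an injected recorder.
class TimedMetricsMiddleware
  def initialize(recorder)
    @recorder = recorder
  end

  def call(event, next_middleware)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = next_middleware.call(event)
    @recorder.increment("jetstream.messages.processed")
    result
  rescue StandardError
    @recorder.increment("jetstream.messages.failed")
    raise # re-raise so the consumer's own error handling still runs
  ensure
    # Runs on both success and failure, so every attempt is timed.
    elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
    @recorder.histogram("jetstream.processing_time_ms", elapsed_ms)
  end
end
```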
|
|
1112
|
+
|
|
1113
|
+
---
|
|
1114
|
+
|
|
1115
|
+
## Best Practices
|
|
1116
|
+
|
|
1117
|
+
1. **App name without environment** - Use "api" not "api-production" for consumer name consistency
|
|
1118
|
+
2. **Idempotent handlers** - Design handlers to be safely retried
|
|
1119
|
+
3. **Enable outbox in production** - Prevents message loss on crashes
|
|
1120
|
+
4. **Enable inbox for critical flows** - Deduplicates by `event_id` so redelivered messages are processed effectively once
|
|
1121
|
+
5. **Monitor DLQ** - Set up alerts for messages in dead letter queue
|
|
1122
|
+
6. **Provision separately** - Use manual provisioning in locked-down environments
|
|
1123
|
+
7. **Health check endpoint** - Expose `JetstreamBridge.health_check` for monitoring
|
|
1124
|
+
8. **Graceful shutdown** - Consumer handles signals automatically
|
|
1125
|
+
9. **Test with Mock NATS** - Fast, no-infra testing (see [TESTING.md](TESTING.md))
|
|
1126
|
+
10. **Tune batch size** - Balance latency vs throughput for your workload
|
|
1127
|
+
|
|
1128
|
+
---
|
|
1129
|
+
|
|
1130
|
+
## Next Steps
|
|
1131
|
+
|
|
1132
|
+
- [Getting Started Guide](GETTING_STARTED.md) - Basic setup and usage
|
|
1133
|
+
- [Production Guide](PRODUCTION.md) - Production deployment patterns
|
|
1134
|
+
- [Restricted Permissions](RESTRICTED_PERMISSIONS.md) - Manual provisioning and security
|
|
1135
|
+
- [Testing Guide](TESTING.md) - Testing with Mock NATS
|
|
@@ -288,7 +288,7 @@ module JetstreamBridge
|
|
|
288
288
|
# Push subscriptions don't have a fetch method, so we use next_msg
|
|
289
289
|
messages = []
|
|
290
290
|
@batch_size.times do
|
|
291
|
-
msg = @psub.next_msg(FETCH_TIMEOUT_SECS)
|
|
291
|
+
msg = @psub.next_msg(timeout: FETCH_TIMEOUT_SECS)
|
|
292
292
|
messages << msg if msg
|
|
293
293
|
rescue NATS::Timeout, NATS::IO::Timeout
|
|
294
294
|
break
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: jetstream_bridge
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 5.0.
|
|
4
|
+
version: 5.0.1
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Mike Attara
|
|
@@ -121,6 +121,7 @@ files:
|
|
|
121
121
|
- CHANGELOG.md
|
|
122
122
|
- LICENSE
|
|
123
123
|
- README.md
|
|
124
|
+
- docs/ARCHITECTURE.md
|
|
124
125
|
- docs/GETTING_STARTED.md
|
|
125
126
|
- docs/PRODUCTION.md
|
|
126
127
|
- docs/RESTRICTED_PERMISSIONS.md
|