jetstream_bridge 4.0.4 → 4.2.0

Files changed (33)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +106 -0
  3. data/README.md +22 -1402
  4. data/docs/GETTING_STARTED.md +92 -0
  5. data/docs/PRODUCTION.md +503 -0
  6. data/docs/TESTING.md +414 -0
  7. data/lib/jetstream_bridge/consumer/consumer.rb +101 -5
  8. data/lib/jetstream_bridge/consumer/inbox/inbox_processor.rb +17 -3
  9. data/lib/jetstream_bridge/consumer/inbox/inbox_repository.rb +19 -7
  10. data/lib/jetstream_bridge/consumer/message_processor.rb +88 -52
  11. data/lib/jetstream_bridge/consumer/subscription_manager.rb +24 -15
  12. data/lib/jetstream_bridge/core/bridge_helpers.rb +85 -0
  13. data/lib/jetstream_bridge/core/config.rb +27 -4
  14. data/lib/jetstream_bridge/core/connection.rb +162 -13
  15. data/lib/jetstream_bridge/core.rb +8 -0
  16. data/lib/jetstream_bridge/models/inbox_event.rb +13 -7
  17. data/lib/jetstream_bridge/models/outbox_event.rb +2 -2
  18. data/lib/jetstream_bridge/publisher/publisher.rb +10 -5
  19. data/lib/jetstream_bridge/rails/integration.rb +153 -0
  20. data/lib/jetstream_bridge/rails/railtie.rb +53 -0
  21. data/lib/jetstream_bridge/rails.rb +5 -0
  22. data/lib/jetstream_bridge/tasks/install.rake +1 -1
  23. data/lib/jetstream_bridge/test_helpers/fixtures.rb +41 -0
  24. data/lib/jetstream_bridge/test_helpers/integration_helpers.rb +77 -0
  25. data/lib/jetstream_bridge/test_helpers/matchers.rb +98 -0
  26. data/lib/jetstream_bridge/test_helpers/mock_nats.rb +524 -0
  27. data/lib/jetstream_bridge/test_helpers.rb +85 -121
  28. data/lib/jetstream_bridge/topology/overlap_guard.rb +46 -0
  29. data/lib/jetstream_bridge/topology/stream.rb +7 -4
  30. data/lib/jetstream_bridge/version.rb +1 -1
  31. data/lib/jetstream_bridge.rb +138 -63
  32. metadata +32 -12
  33. data/lib/jetstream_bridge/railtie.rb +0 -49
@@ -0,0 +1,92 @@
1
+ # Getting Started
2
+
3
+ This guide covers installation, Rails setup, configuration, and basic publish/consume flows.
4
+
5
+ ## Install
6
+
7
+ ```ruby
8
+ # Gemfile
9
+ gem "jetstream_bridge", "~> 4.0"
10
+ ```
11
+
12
+ ```bash
13
+ bundle install
14
+ ```
15
+
16
+ ## Rails Setup
17
+
18
+ Generate initializer, migrations, and optional health check:
19
+
20
+ ```bash
21
+ bin/rails g jetstream_bridge:install
22
+ # or separately:
23
+ bin/rails g jetstream_bridge:initializer
24
+ bin/rails g jetstream_bridge:migrations
25
+ bin/rails g jetstream_bridge:health_check
26
+ bin/rails db:migrate
27
+ ```
28
+
29
+ Generators create:
30
+
31
+ - `config/initializers/jetstream_bridge.rb`
32
+ - `db/migrate/*_create_jetstream_outbox_events.rb`
33
+ - `db/migrate/*_create_jetstream_inbox_events.rb`
34
+ - `app/controllers/jetstream_health_controller.rb` (health check)
35
+
36
+ ## Configuration
37
+
38
+ ```ruby
39
+ # config/initializers/jetstream_bridge.rb
40
+ JetstreamBridge.configure do |config|
41
+ config.nats_urls = ENV.fetch("NATS_URLS", "nats://localhost:4222")
42
+ config.env = ENV.fetch("RAILS_ENV", "development")
43
+ config.app_name = "my_app"
44
+ config.destination_app = ENV.fetch("DESTINATION_APP")
45
+
46
+ config.use_outbox = true # transactional publish
47
+ config.use_inbox = true # idempotent consume
48
+ config.use_dlq = true # route poison messages
49
+
50
+ # Consumer tuning
51
+ config.max_deliver = 5
52
+ config.ack_wait = "30s"
53
+ config.backoff = %w[1s 5s 15s 30s 60s]
54
+ end
55
+ ```
56
+
57
+ Rails autostart can be disabled for rake tasks and console sessions by setting `config.lazy_connect = true` or `JETSTREAM_BRIDGE_DISABLE_AUTOSTART=1`; the bridge then connects on the first publish or subscribe.
58
+
59
+ ## Publish
60
+
61
+ ```ruby
62
+ JetstreamBridge.publish(
63
+ event_type: "user.created",
64
+ resource_type: "user",
65
+ payload: { id: user.id, email: user.email }
66
+ )
67
+ ```
68
+
69
+ ## Consume
70
+
71
+ ```ruby
72
+ consumer = JetstreamBridge::Consumer.new do |event|
73
+ user = event.payload
74
+ User.upsert({ id: user["id"], email: user["email"] })
75
+ end
76
+
77
+ consumer.run!
78
+ ```
79
+
80
+ ## Rake Tasks
81
+
82
+ ```bash
83
+ bin/rake jetstream_bridge:health # Check health/connection
84
+ bin/rake jetstream_bridge:validate # Validate configuration
85
+ bin/rake jetstream_bridge:test_connection # Test NATS connectivity
86
+ bin/rake jetstream_bridge:debug # Dump debug info
87
+ ```
88
+
89
+ ## Next Steps
90
+
91
+ - Production hardening: [docs/PRODUCTION.md](PRODUCTION.md)
92
+ - Testing with Mock NATS: [docs/TESTING.md](TESTING.md)
@@ -0,0 +1,503 @@
1
+ # Production Deployment Guide
2
+
3
+ This guide provides recommendations for deploying JetStream Bridge in production environments with a focus on reliability, security, and performance.
4
+
5
+ ## Table of Contents
6
+
7
+ - [Database Connection Pool Sizing](#database-connection-pool-sizing)
8
+ - [NATS Connection Configuration](#nats-connection-configuration)
9
+ - [Consumer Tuning](#consumer-tuning)
10
+ - [Monitoring & Alerting](#monitoring--alerting)
11
+ - [Security Hardening](#security-hardening)
12
+ - [Kubernetes Deployment](#kubernetes-deployment)
13
+ - [Performance Optimization](#performance-optimization)
14
+
15
+ ---
16
+
17
+ ## Database Connection Pool Sizing
18
+
19
+ JetStream Bridge uses ActiveRecord for outbox/inbox patterns. Proper connection pool sizing is critical for high-throughput applications.
20
+
21
+ ### Recommended Configuration
22
+
23
+ ```yaml
24
+ # config/database.yml
25
+ production:
26
+   adapter: postgresql
27
+   pool: <%= ENV.fetch("RAILS_MAX_THREADS", 5).to_i + 10 %>
28
+   timeout: 5000
29
+   # Add buffer connections beyond web workers for consumer processes
30
+ ```
31
+
32
+ ### Sizing Guidelines
33
+
34
+ **Publishers (Web/API processes):**
35
+ - 1-2 connections per process (uses existing AR pool)
36
+ - Example: 4 Puma workers × 5 threads = 20 connections minimum
37
+
38
+ **Consumers:**
39
+ - Dedicated connections per consumer process
40
+ - Recommended: 2-5 connections per consumer
41
+ - Example: 3 consumer processes = 6-15 connections
42
+
43
+ **Total Formula:**
44
+ ```
45
+ Total Connections = (Web Workers × Threads) + (Consumers × 3) + 10 buffer
46
+ ```
47
+
48
+ ### Example Calculation
49
+
50
+ For a typical production setup:
51
+ - 4 Puma workers × 5 threads = 20 connections
52
+ - 3 consumer processes × 3 connections = 9 connections
53
+ - 10 connection buffer = 10 connections
54
+ - **Total: 39 connections**
55
+
56
+ Set your PostgreSQL `max_connections` to at least 50-60 to allow headroom.
57
+
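The sizing formula above can be checked with a few lines of Ruby. This is a minimal sketch and not part of the gem: the method name `total_db_connections` and its keyword arguments are hypothetical, and the defaults (3 connections per consumer, 10 buffer connections) are taken from the formula.

```ruby
# Hypothetical helper (not part of jetstream_bridge): estimates the total
# number of database connections using the formula from this guide.
def total_db_connections(web_workers:, threads:, consumers:, per_consumer: 3, buffer: 10)
  web = web_workers * threads          # one connection per web thread
  consumer = consumers * per_consumer  # dedicated connections per consumer process
  web + consumer + buffer
end

# 4 Puma workers x 5 threads, 3 consumer processes: 20 + 9 + 10
puts total_db_connections(web_workers: 4, threads: 5, consumers: 3)
```

For the example setup this prints 39, which matches the calculation above and leaves headroom under a `max_connections` of 50-60.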
58
+ ---
59
+
60
+ ## NATS Connection Configuration
61
+
62
+ ### High Availability Setup
63
+
64
+ Configure multiple NATS servers for fault tolerance:
65
+
66
+ ```ruby
67
+ # config/initializers/jetstream_bridge.rb
68
+ JetstreamBridge.configure do |config|
69
+ # Use cluster URLs for HA
70
+ config.nats_urls = "nats://nats1:4222,nats://nats2:4222,nats://nats3:4222"
71
+
72
+ # Adjust reconnect settings for production
73
+ config.connect_retry_attempts = 5 # Default: 3
74
+ config.connect_retry_delay = 3 # Default: 2 seconds
75
+
76
+ # Required configuration
77
+ config.env = ENV.fetch("RAILS_ENV", "production")
78
+ config.app_name = ENV.fetch("APP_NAME", "myapp")
79
+ config.destination_app = ENV.fetch("DESTINATION_APP")
80
+
81
+ # Enable reliability features
82
+ config.use_outbox = true
83
+ config.use_inbox = true
84
+ config.use_dlq = true
85
+ end
86
+ ```
87
+
88
+ ### TLS Configuration
89
+
90
+ For secure communications:
91
+
92
+ ```ruby
93
+ JetstreamBridge.configure do |config|
94
+ config.nats_urls = "nats+tls://nats:4222"
95
+ # Ensure NATS server has TLS enabled with valid certificates
96
+ end
97
+ ```
98
+
99
+ ---
100
+
101
+ ## Consumer Tuning
102
+
103
+ Optimize consumer configuration based on your workload:
104
+
105
+ ### Basic Consumer Configuration
106
+
107
+ ```ruby
108
+ JetstreamBridge.configure do |config|
109
+ # Adjust batch size based on message processing time
110
+ # Larger = better throughput, smaller = lower latency
111
+ # Default: 25
112
+ config.batch_size = 50
113
+
114
+ # Increase max_deliver for critical messages
115
+ # Default: 5
116
+ config.max_deliver = 10
117
+
118
+ # Adjust ack_wait for slow processors
119
+ # Default: 30s
120
+ config.ack_wait = "60s"
121
+
122
+ # Configure exponential backoff delays
123
+ # Default: [1s, 5s, 15s, 30s, 60s]
124
+ config.backoff = %w[2s 10s 30s 60s 120s]
125
+ end
126
+ ```
127
+
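To see how `max_deliver` and `backoff` interact, it can help to compute the retry schedule they imply. The sketch below is illustrative only: `duration_to_seconds` and `redelivery_delays` are hypothetical helpers, the duration syntax is assumed to be a number followed by `ms`, `s`, `m`, or `h`, and the last backoff entry is assumed to repeat once the list is exhausted. How the gem itself parses and applies these values is not shown here.

```ruby
# Hypothetical helpers (not part of jetstream_bridge).

# Converts a duration string such as "30s" or "2m" into seconds.
def duration_to_seconds(value)
  match = /\A(\d+)(ms|s|m|h)\z/.match(value.to_s)
  raise ArgumentError, "unsupported duration: #{value.inspect}" unless match

  amount = match[1].to_i
  case match[2]
  when "ms" then amount / 1000.0
  when "s" then amount
  when "m" then amount * 60
  when "h" then amount * 3600
  end
end

# Returns the wait before each redelivery. A message delivered at most
# max_deliver times is redelivered at most max_deliver - 1 times.
# Assumption: once the backoff list runs out, its last entry repeats.
def redelivery_delays(backoff, max_deliver)
  seconds = backoff.map { |d| duration_to_seconds(d) }
  return [] if seconds.empty?

  Array.new([max_deliver - 1, 0].max) { |i| seconds[[i, seconds.length - 1].min] }
end

delays = redelivery_delays(%w[2s 10s 30s 60s 120s], 10)
puts delays.inspect
puts "worst-case retry window: #{delays.sum}s"
```

Under these assumptions, the settings above give nine redeliveries and a retry window of 702 seconds before a message would be routed to the DLQ, which is worth comparing against how long a downstream outage typically lasts.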
128
+ ### Consumer Best Practices
129
+
130
+ 1. **Run consumers in dedicated processes/containers** separate from web workers
131
+ 2. **Scale horizontally** by running multiple consumer instances
132
+ 3. **Monitor consumer lag** via NATS JetStream metrics
133
+ 4. **Set appropriate resource limits** (CPU, memory) in Kubernetes/Docker
134
+
135
+ ### Memory Management
136
+
137
+ Long-running consumers automatically:
138
+ - Log health checks every 10 minutes (iterations, memory, uptime)
139
+ - Warn when memory exceeds 1GB
140
+ - Suggest garbage collection when heap grows large
141
+
142
+ Monitor these logs to detect memory leaks early.
143
+
144
+ ---
145
+
146
+ ## Monitoring & Alerting
147
+
148
+ ### Key Metrics to Track
149
+
150
+ | Metric | Description | Alert Threshold |
151
+ |--------|-------------|-----------------|
152
+ | Consumer Lag | Pending messages in stream | > 1000 messages |
153
+ | DLQ Size | Messages in dead letter queue | > 100 messages |
154
+ | Connection Status | Health check failures | 2 consecutive failures |
155
+ | Processing Rate | Messages/second throughput | < expected baseline |
156
+ | Memory Usage | Consumer memory consumption | > 1GB per consumer |
157
+ | Error Rate | Failed message processing | > 5% |
158
+
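The thresholds in the table can be expressed as simple rules for a monitoring script. This is a rough sketch under stated assumptions: the metric names, the shape of the sample hash, and the helper names are hypothetical, and the processing-rate rule is left out because its baseline is specific to each deployment.

```ruby
# Hypothetical alert rules mirroring the table above. Metric names and the
# shape of the input hash are assumptions for illustration only.
def alert_rules
  {
    consumer_lag: ->(v) { v > 1000 },               # pending messages
    dlq_size: ->(v) { v > 100 },                    # messages in the DLQ
    consecutive_health_failures: ->(v) { v >= 2 },  # failed health checks in a row
    memory_bytes: ->(v) { v > 1024**3 },            # more than 1GB per consumer
    error_rate: ->(v) { v > 0.05 }                  # more than 5% failures
  }
end

# Returns the names of the metrics that breach their threshold.
# Metrics missing from the sample are skipped.
def breached_alerts(sample)
  alert_rules.select { |name, rule| sample.key?(name) && rule.call(sample[name]) }.keys
end

sample = {
  consumer_lag: 1500,
  dlq_size: 12,
  consecutive_health_failures: 0,
  memory_bytes: 700 * 1024**2,
  error_rate: 0.08
}
puts breached_alerts(sample).inspect
```

In practice these rules would usually live in the alerting system itself (for example as Prometheus alert rules); the Ruby form is only meant to make the thresholds concrete.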
159
+ ### Health Check Endpoint
160
+
161
+ Use the built-in health check for monitoring:
162
+
163
+ ```ruby
164
+ # config/routes.rb
165
+ Rails.application.routes.draw do
166
+ get '/health/jetstream', to: proc { |env|
167
+ health = JetstreamBridge.health_check
168
+ status = health[:healthy] ? 200 : 503
169
+ [status, { 'Content-Type' => 'application/json' }, [health.to_json]]
170
+ }
171
+ end
172
+ ```
173
+
174
+ ### Response Format
175
+
176
+ ```json
177
+ {
178
+ "healthy": true,
179
+ "connection": {
180
+ "state": "connected",
181
+ "connected": true,
182
+ "connected_at": "2025-11-23T10:00:00Z"
183
+ },
184
+ "stream": {
185
+ "exists": true,
186
+ "name": "production-jetstream-bridge-stream",
187
+ "subjects": ["production.app.sync.worker"],
188
+ "messages": 1523
189
+ },
190
+ "performance": {
191
+ "nats_rtt_ms": 2.5,
192
+ "health_check_duration_ms": 45.2
193
+ },
194
+ "config": {
195
+ "env": "production",
196
+ "app_name": "app",
197
+ "destination_app": "worker",
198
+ "use_outbox": true,
199
+ "use_inbox": true,
200
+ "use_dlq": true
201
+ },
202
+ "version": "4.0.3"
203
+ }
204
+ ```
205
+
206
+ ### Prometheus Metrics (Example)
207
+
208
+ ```ruby
209
+ # config/initializers/prometheus.rb
210
+ require 'prometheus/client'
211
+
212
+ prometheus = Prometheus::Client.registry
213
+
214
+ # Track message processing
215
+ jetstream_messages = prometheus.counter(
216
+ :jetstream_messages_total,
217
+ docstring: 'Total messages processed',
218
+ labels: [:status, :event_type]
219
+ )
220
+
221
+ # Track processing duration
222
+ jetstream_duration = prometheus.histogram(
223
+ :jetstream_processing_duration_seconds,
224
+ docstring: 'Message processing duration',
225
+ labels: [:event_type]
226
+ )
227
+
228
+ # Track DLQ size
229
+ jetstream_dlq = prometheus.gauge(
230
+ :jetstream_dlq_size,
231
+ docstring: 'Messages in DLQ'
232
+ )
233
+ ```
234
+
235
+ ---
236
+
237
+ ## Security Hardening
238
+
239
+ ### Rate Limiting
240
+
241
+ The health check endpoint has built-in rate limiting (1 uncached request per 5 seconds). For HTTP endpoints, add additional protection:
242
+
243
+ ```ruby
244
+ # Gemfile
245
+ gem 'rack-attack'
246
+
247
+ # config/initializers/rack_attack.rb
248
+ Rack::Attack.throttle('health_checks', limit: 30, period: 1.minute) do |req|
249
+ req.ip if req.path == '/health/jetstream'
250
+ end
251
+ ```
252
+
253
+ ### Subject Validation
254
+
255
+ JetStream Bridge validates subject components to prevent injection attacks. The following are automatically rejected:
256
+ - NATS token separators and wildcards (`.`, `*`, `>`)
257
+ - Spaces and control characters
258
+ - Components exceeding 255 characters
259
+
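The rules listed above can be approximated with a small predicate. This is a sketch of the idea, not the gem's actual validator: the method name `valid_subject_component?` is hypothetical, and rejecting empty components is an added assumption.

```ruby
# Hypothetical validator (not the gem's implementation). A subject component
# is one dot-separated token, such as an app name or environment.
def valid_subject_component?(value)
  text = value.to_s
  return false if text.empty?       # assumption: empty components are rejected
  return false if text.length > 255 # length limit from the list above

  # Reject the token separator, wildcards, whitespace, and control characters.
  !text.match?(/[.*>[:space:][:cntrl:]]/)
end

%w[orders orders.created app* >].each do |candidate|
  puts "#{candidate.inspect}: #{valid_subject_component?(candidate)}"
end
```

Only `"orders"` passes in this example; the other three contain a separator or a wildcard and would change which subjects a subscription matches if they were interpolated unchecked.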
260
+ ### Credential Management
261
+
262
+ **Never hardcode credentials:**
263
+
264
+ ```ruby
265
+ # ❌ BAD
266
+ config.nats_urls = "nats://user:password@localhost:4222"
267
+
268
+ # ✅ GOOD
269
+ config.nats_urls = ENV.fetch("NATS_URLS")
270
+ ```
271
+
272
+ Credentials in logs are automatically sanitized:
273
+ - `nats://user:pass@host:4222` → `nats://user:***@host:4222`
274
+ - `nats://token@host:4222` → `nats://***@host:4222`
275
+
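The masking shown in the two examples above can be reproduced with a short helper, which is handy when application code logs connection strings itself. This is a minimal sketch, not the gem's sanitizer; the method names `sanitize_nats_url` and `sanitize_nats_urls` are hypothetical.

```ruby
# Hypothetical helpers (not the gem's implementation).

# Masks the secret part of a single NATS URL:
#   user:pass@host -> user:***@host
#   token@host     -> ***@host
# URLs without credentials are returned unchanged.
def sanitize_nats_url(url)
  url.sub(%r{\A([a-z+]+://)([^@/]+)@}i) do
    scheme = Regexp.last_match(1)
    userinfo = Regexp.last_match(2)
    user, separator, _secret = userinfo.partition(":")
    masked = separator.empty? ? "***" : "#{user}:***"
    "#{scheme}#{masked}@"
  end
end

# Applies the masking to a comma-separated list of URLs.
def sanitize_nats_urls(urls)
  urls.split(",").map { |url| sanitize_nats_url(url.strip) }.join(",")
end

puts sanitize_nats_urls("nats://user:pass@nats1:4222,nats://token@nats2:4222,nats://nats3:4222")
```

The helper keeps the username visible, as in the first example above, so log lines still show which account was used without exposing the password or token.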
276
+ ### Network Security
277
+
278
+ 1. **Use TLS** for NATS connections in production
279
+ 2. **Restrict network access** to NATS ports (4222) via firewall rules
280
+ 3. **Use private networks** for inter-service communication
281
+ 4. **Enable authentication** on NATS server
282
+
283
+ ---
284
+
285
+ ## Kubernetes Deployment
286
+
287
+ ### Deployment Configuration
288
+
289
+ ```yaml
290
+ # deployment.yaml
291
+ apiVersion: apps/v1
292
+ kind: Deployment
293
+ metadata:
294
+   name: jetstream-consumer
295
+ spec:
296
+   replicas: 3
297
+   selector:
298
+     matchLabels:
299
+       app: jetstream-consumer
300
+   template:
301
+     metadata:
302
+       labels:
303
+         app: jetstream-consumer
304
+     spec:
305
+       containers:
306
+       - name: consumer
307
+         image: myapp:latest
308
+         command: ["bundle", "exec", "rake", "jetstream:consume"]
309
+         env:
310
+         - name: NATS_URLS
311
+           valueFrom:
312
+             secretKeyRef:
313
+               name: nats-credentials
314
+               key: urls
315
+         - name: RAILS_ENV
316
+           value: "production"
317
+         - name: APP_NAME
318
+           value: "myapp"
319
+         - name: DESTINATION_APP
320
+           value: "worker"
321
+         resources:
322
+           requests:
323
+             memory: "256Mi"
324
+             cpu: "100m"
325
+           limits:
326
+             memory: "1Gi"
327
+             cpu: "500m"
328
+         livenessProbe:
329
+           exec:
330
+             command:
331
+             - pgrep
332
+             - -f
333
+             - "rake jetstream:consume"
334
+           initialDelaySeconds: 30
335
+           periodSeconds: 10
336
+         readinessProbe:
337
+           httpGet:
338
+             path: /health/jetstream
339
+             port: 3000
340
+           initialDelaySeconds: 10
341
+           periodSeconds: 5
342
+           timeoutSeconds: 5
343
+ ```
344
+
345
+ ### Health Probes
346
+
347
+ **Liveness Probe:** Checks if the consumer process is running
348
+ ```yaml
349
+ livenessProbe:
350
+   exec:
351
+     command: ["pgrep", "-f", "jetstream:consume"]
352
+   initialDelaySeconds: 30
353
+   periodSeconds: 10
354
+ ```
355
+
356
+ **Readiness Probe:** Checks if NATS connection is healthy
357
+ ```yaml
358
+ readinessProbe:
359
+   httpGet:
360
+     path: /health/jetstream
361
+     port: 3000
362
+   initialDelaySeconds: 10
363
+   periodSeconds: 5
364
+   timeoutSeconds: 5
365
+ ```
366
+
367
+ ### Horizontal Pod Autoscaling
368
+
369
+ ```yaml
370
+ # hpa.yaml
371
+ apiVersion: autoscaling/v2
372
+ kind: HorizontalPodAutoscaler
373
+ metadata:
374
+   name: jetstream-consumer-hpa
374
+ spec:
375
+   scaleTargetRef:
376
+     apiVersion: apps/v1
377
+     kind: Deployment
378
+     name: jetstream-consumer
379
+   minReplicas: 3
380
+   maxReplicas: 10
381
+   metrics:
382
+   - type: Resource
383
+     resource:
384
+       name: cpu
385
+       target:
386
+         type: Utilization
387
+         averageUtilization: 70
388
+   - type: Resource
389
+     resource:
390
+       name: memory
391
+       target:
392
+         type: Utilization
393
+         averageUtilization: 80
395
+ ```
396
+
397
+ ---
398
+
399
+ ## Performance Optimization
400
+
401
+ ### Database Query Optimization
402
+
403
+ 1. **Add indexes** on frequently queried columns:
404
+
405
+ ```sql
406
+ -- Outbox queries
407
+ CREATE INDEX idx_outbox_status_created ON jetstream_outbox_events(status, created_at);
408
+ CREATE INDEX idx_outbox_event_id ON jetstream_outbox_events(event_id);
409
+
410
+ -- Inbox queries
411
+ CREATE INDEX idx_inbox_event_id ON jetstream_inbox_events(event_id);
412
+ CREATE INDEX idx_inbox_stream_seq ON jetstream_inbox_events(stream, stream_seq);
413
+ CREATE INDEX idx_inbox_status ON jetstream_inbox_events(status);
414
+ ```
415
+
416
+ 2. **Partition large tables** (for high-volume applications):
417
+
418
+ ```sql
419
+ -- Partition outbox by month
420
+ CREATE TABLE jetstream_outbox_events (
421
+ -- columns
422
+ ) PARTITION BY RANGE (created_at);
423
+
424
+ CREATE TABLE jetstream_outbox_events_2025_11
425
+ PARTITION OF jetstream_outbox_events
426
+ FOR VALUES FROM ('2025-11-01') TO ('2025-12-01');
427
+ ```
428
+
429
+ 3. **Archive old records** to prevent table bloat:
430
+
431
+ ```ruby
432
+ # lib/tasks/jetstream_maintenance.rake
433
+ namespace :jetstream do
434
+ desc "Archive old outbox events (older than 30 days)"
435
+ task archive_outbox: :environment do
436
+ cutoff = 30.days.ago
437
+ JetstreamBridge::OutboxEvent.where('created_at < ?', cutoff)
438
+ .where(status: 'sent')
439
+ .in_batches
440
+ .delete_all
441
+ end
442
+ end
443
+ ```
444
+
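Monthly partitions like the one in step 2 have to be created ahead of time, so a small generator can save manual work. The sketch below is one possible approach and makes several assumptions: `monthly_partition_ddl` is a hypothetical helper, the naming scheme follows the `jetstream_outbox_events_2025_11` example, and the table name is interpolated directly, so it should only ever come from trusted code.

```ruby
require "date"

# Hypothetical helper: builds the DDL for one monthly partition of a table
# that is already declared with PARTITION BY RANGE (created_at).
def monthly_partition_ddl(year, month, table: "jetstream_outbox_events")
  from = Date.new(year, month, 1)
  to = from.next_month # first day of the following month (exclusive upper bound)
  name = format("%s_%04d_%02d", table, year, month)
  <<~SQL
    CREATE TABLE #{name}
      PARTITION OF #{table}
      FOR VALUES FROM ('#{from.iso8601}') TO ('#{to.iso8601}');
  SQL
end

puts monthly_partition_ddl(2025, 11)
puts monthly_partition_ddl(2025, 12) # upper bound rolls over to 2026-01-01
```

The generated statement could be run from a scheduled rake task a few weeks before each month starts, alongside the archive task from step 3.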
445
+ ### NATS Optimization
446
+
447
+ 1. **Use connection pooling** for high-throughput publishers
448
+ 2. **Enable stream compression** for large payloads (NATS server config)
449
+ 3. **Monitor stream storage** and set retention policies
450
+
451
+ ### Consumer Optimization
452
+
453
+ 1. **Increase batch size** for high-throughput scenarios (up to 100)
454
+ 2. **Use multiple consumers** to parallelize processing
455
+ 3. **Optimize handler code** to minimize per-message overhead
456
+ 4. **Profile memory usage** with tools like `memory_profiler` gem
457
+
458
+ ---
459
+
460
+ ## Troubleshooting
461
+
462
+ ### Common Issues
463
+
464
+ **High Consumer Lag:**
465
+ - Scale up consumer instances
466
+ - Increase batch size
467
+ - Optimize handler processing time
468
+ - Check database connection pool
469
+
470
+ **Memory Leaks:**
471
+ - Monitor consumer health logs
472
+ - Enable memory profiling
473
+ - Check for circular references in handlers
474
+ - Restart consumers periodically (Kubernetes handles this)
475
+
476
+ **Connection Issues:**
477
+ - Verify NATS server is accessible
478
+ - Check firewall rules
479
+ - Validate TLS certificates
480
+ - Review connection retry settings
481
+
482
+ **DLQ Growing:**
483
+ - Investigate failed message patterns
484
+ - Fix bugs in message handlers
485
+ - Increase max_deliver for transient errors
486
+ - Set up DLQ consumer for manual intervention
487
+
488
+ ---
489
+
490
+ ## Additional Resources
491
+
492
+ - [NATS JetStream Documentation](https://docs.nats.io/nats-concepts/jetstream)
493
+ - [PostgreSQL Connection Pooling](https://www.postgresql.org/docs/current/runtime-config-connection.html)
494
+ - [Kubernetes Best Practices](https://kubernetes.io/docs/concepts/configuration/overview/)
495
+ - [Prometheus Ruby Client](https://github.com/prometheus/client_ruby)
496
+
497
+ ---
498
+
499
+ ## Support
500
+
501
+ For issues or questions:
502
+ - GitHub Issues: https://github.com/attaradev/jetstream_bridge/issues
503
+ - Documentation: https://github.com/attaradev/jetstream_bridge