@techstream/quark-create-app 1.7.0 → 1.8.0

@@ -0,0 +1,690 @@
1
+ # Quark Worker Service
2
+
3
+ Background job processor for the Quark monorepo using **BullMQ** and **Redis**. Handles asynchronous tasks like email delivery, file cleanup, and domain-specific business logic.
4
+
5
+ ## Architecture Overview
6
+
7
+ The worker service is designed around resilience, observability, and graceful degradation:
8
+
9
+ ```
10
+ ┌─────────────────────────────────────────────────────┐
11
+ │ Worker Service (apps/worker) │
12
+ ├─────────────────────────────────────────────────────┤
13
+ │ Init Phase: │
14
+ │ └─ Health checks (Redis, Database) │
15
+ │ └─ Register handlers for all queues │
16
+ │ └─ Start listening for jobs │
17
+ ├─────────────────────────────────────────────────────┤
18
+ │ Runtime Phase: │
19
+ │ └─ Dispatch jobs to registered handlers │
20
+ │ └─ Track failures with exponential backoff │
21
+ │ └─ Log events at each state transition │
22
+ │ └─ Report errors to centralized error reporter │
23
+ ├─────────────────────────────────────────────────────┤
24
+ │ Shutdown Phase: │
25
+ │ └─ Drain in-flight jobs gracefully (up to 30s) │
26
+ │ └─ Close database connection │
27
+ │ └─ Close Redis connection │
28
+ │ └─ Exit with success/failure code │
29
+ └─────────────────────────────────────────────────────┘
30
+ ```
31
+
32
+ ## Quick Start
33
+
34
+ ### Development
35
+
36
+ ```bash
37
+ # Start local infrastructure
38
+ docker compose up -d
39
+
40
+ # Run worker in watch mode
41
+ pnpm dev
42
+
43
+ # Run tests
44
+ pnpm test
45
+
46
+ # Check code style
47
+ pnpm lint
48
+ ```
49
+
50
+ ### Preflight Check (Deployment)
51
+
52
+ Before deploying, verify that the worker can connect to all required services:
53
+
54
+ ```bash
55
+ # Run health checks without starting the job listener
56
+ pnpm preflight
57
+
58
+ # Exit codes:
59
+ # 0 = All systems ready
60
+ # 1 = Connection failure or health check failed
61
+ ```
62
+
63
+ The preflight check runs `node src/index.js --preflight` and validates:
64
+ - Redis connectivity
65
+ - Database connectivity (Prisma)
66
+ - All required environment variables
67
+ - Job handler registration
68
+
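A minimal sketch of how a `--preflight` flag can map those checks onto the exit codes above. This is illustrative only; the real wiring in `src/index.js` may differ, and `checkRedis`/`checkDatabase` are stand-ins for the actual health checks:

```javascript
// Illustrative preflight runner; checkRedis/checkDatabase stand in for the
// real Redis ping and Prisma connectivity checks, which throw on failure.
async function checkRedis() { /* e.g. await redis.ping() */ }
async function checkDatabase() { /* e.g. await prisma.$queryRaw`SELECT 1` */ }

async function preflight() {
  try {
    await checkRedis();
    await checkDatabase();
    console.log("All systems ready");
    return 0; // exit code 0
  } catch (err) {
    console.error("Preflight failed:", err.message);
    return 1; // exit code 1
  }
}
```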
69
+ ## Configuration & Environment Variables
70
+
71
+ ### Redis Configuration
72
+
73
+ ```env
74
+ # Redis connection (defaults: localhost:6379)
75
+ REDIS_HOST=localhost
76
+ REDIS_PORT=6379
77
+ REDIS_DB=0
78
+ REDIS_PASSWORD= # Optional
79
+ REDIS_TLS_CA= # Optional: Path to CA certificate
80
+ REDIS_TLS_CERT= # Optional: Path to client certificate
81
+ REDIS_TLS_KEY= # Optional: Path to client key
82
+ ```
83
+
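For illustration, the variables above can be assembled into a connection object like the one below. The option names follow ioredis conventions; the project's own wrapper may shape this differently:

```javascript
// Build a Redis connection config from the env vars above; defaults match
// the documented ones. Password is only attached when provided.
const redisConfig = {
  host: process.env.REDIS_HOST || "localhost",
  port: parseInt(process.env.REDIS_PORT || "6379", 10),
  db: parseInt(process.env.REDIS_DB || "0", 10),
  ...(process.env.REDIS_PASSWORD && { password: process.env.REDIS_PASSWORD }),
};
```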
84
+ ### Worker Configuration
85
+
86
+ ```env
87
+ # Job processing
88
+ WORKER_CONCURRENCY=5 # Number of concurrent jobs per queue
89
+ WORKER_HEALTH_RETRIES=10 # Max attempts to connect on startup
90
+ WORKER_HEALTH_INTERVAL_MS=1000 # Delay between health check attempts (ms)
91
+ WORKER_MAX_FAILURES=100 # Threshold for circuit breaker (jobs/minute)
92
+
93
+ # Graceful shutdown (in production, orchestrator timeout should be 5-10s more)
94
+ WORKER_SHUTDOWN_TIMEOUT_MS=30000 # Max time to drain in-flight jobs
95
+ ```
96
+
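`WORKER_MAX_FAILURES` implies a failures-per-minute circuit breaker. A hypothetical sliding-window sketch (the real implementation may track state differently):

```javascript
// Hypothetical sliding-window failure counter: returns true once the
// number of failures recorded in the last minute exceeds the threshold.
const maxFailures = parseInt(process.env.WORKER_MAX_FAILURES || "100", 10);
let failureTimes = [];

function recordFailure(now = Date.now()) {
  failureTimes.push(now);
  // Keep only failures from the last 60 seconds
  failureTimes = failureTimes.filter((t) => now - t < 60_000);
  return failureTimes.length > maxFailures; // true => trip the breaker
}
```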
97
+ ### Job Queue Configuration
98
+
99
+ Job defaults are defined in `@techstream/quark-jobs`:
100
+
101
+ ```javascript
102
+ // Default retry strategy for all jobs
103
+ {
104
+ attempts: 3, // Retry up to 3 times
105
+ backoff: {
106
+ type: "exponential",    // retry delays: 2s, then 4s
107
+ delay: 2000
108
+ },
109
+ removeOnComplete: true // Clean up jobs after success
110
+ }
111
+ ```
112
+
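BullMQ's built-in exponential strategy computes each retry delay as roughly `delay * 2^(attemptsMade - 1)`; a quick sketch of the delays the defaults above produce:

```javascript
// Retry delay for the built-in exponential strategy:
// baseDelay * 2^(attemptsMade - 1).
function backoffDelay(attemptsMade, baseDelay = 2000) {
  return baseDelay * 2 ** (attemptsMade - 1);
}

// With attempts: 3 there are at most two retries:
// after the 1st failure wait backoffDelay(1) = 2000 ms,
// after the 2nd failure wait backoffDelay(2) = 4000 ms.
```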
113
+ ### Database Configuration
114
+
115
+ Worker uses the local `@techstream/quark-db` package which requires:
116
+
117
+ ```env
118
+ DATABASE_URL=postgresql://user:password@localhost:5432/quark_dev
119
+ ```
120
+
121
+ ## Package.json Scripts
122
+
123
+ | Script | Purpose |
124
+ |--------|---------|
125
+ | `pnpm dev` | Run worker in watch mode (tsx watch) |
126
+ | `pnpm preflight` | Execute health checks and exit |
127
+ | `pnpm test` | Run all tests in src/*.test.js |
128
+ | `pnpm lint` | Format and lint code with Biome |
129
+
130
+ ## Job Handler Patterns
131
+
132
+ Job handlers live in `src/handlers/` and are registered in `src/handlers/index.js`. For example:
133
+
134
+ ```javascript
135
+ // src/handlers/email.js
+ import { emailService } from "@techstream/quark-core";
+ import { prisma } from "@techstream/quark-db";
+
+ export async function handleSendWelcomeEmail(bullJob, logger) {
137
+ const { userId } = bullJob.data;
138
+
139
+ if (!userId) {
140
+ throw new Error("userId is required");
141
+ }
142
+
143
+ const user = await prisma.user.findUnique({ where: { id: userId } });
144
+ if (!user?.email) {
145
+ throw new Error(`User ${userId} not found or has no email`);
146
+ }
147
+
148
+ await emailService.send({
149
+ to: user.email,
150
+ subject: "Welcome to Quark",
151
+ html: "<p>Welcome!</p>"
152
+ });
153
+
154
+ logger.info(`Welcome email sent to ${user.email}`);
155
+ return { success: true, email: user.email };
156
+ }
157
+ ```
158
+
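The registration in `src/handlers/index.js` can be pictured as a plain job-name → handler map; the names and shape below are illustrative, not the repo's actual code:

```javascript
// Hypothetical registry: map job names to handler functions.
const handlers = {
  "send-welcome-email": async (bullJob, logger) => {
    logger.info(`processing ${bullJob.id}`);
    return { success: true };
  },
};

// Dispatch a BullMQ job to its registered handler.
function dispatch(bullJob, logger) {
  const handler = handlers[bullJob.name];
  if (!handler) throw new Error(`No handler registered for "${bullJob.name}"`);
  return handler(bullJob, logger);
}
```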
159
+ ### Handler Signature
160
+
161
+ Every job handler receives:
162
+
163
+ ```typescript
164
+ async function jobHandler(
165
+ bullJob: {
166
+ id: string; // Unique job ID
167
+ name: string; // Job type (e.g., "send-welcome-email")
168
+ data: Object; // Payload from queue.add()
169
+ attemptsMade: number; // Attempts made so far (0 on first run)
170
+ },
171
+ logger: {
172
+ info: Function;
173
+ warn: Function;
174
+ error: Function;
175
+ }
176
+ ): Promise<any> {
177
+ // Return: Success result (serializable)
178
+ // Throw: Error (triggers retry or failure)
179
+ }
180
+ ```
181
+
182
+ ### Error Handling in Handlers
183
+
184
+ ```javascript
185
+ export async function handleCleanupFiles(bullJob, logger) {
+ const retentionHours = bullJob.data?.retentionHours || 24;
+ const cutoffDate = new Date(Date.now() - retentionHours * 60 * 60 * 1000);
+
+ logger.info("Starting cleanup", { retentionHours });
+
+ const orphaned = await prisma.file.findMany({
+ where: { createdAt: { lt: cutoffDate } }
+ });
193
+
194
+ let deleted = 0;
195
+ const errors = [];
196
+
197
+ for (const file of orphaned) {
198
+ try {
199
+ await storage.delete(file.storageKey);
200
+ await prisma.file.delete({ where: { id: file.id } });
201
+ deleted++;
202
+ } catch (err) {
203
+ // Continue processing others, log failures
204
+ errors.push({ fileId: file.id, error: err.message });
205
+ logger.warn(`Failed to delete file ${file.id}`, { error: err.message });
206
+ }
207
+ }
208
+
209
+ return {
210
+ success: errors.length < orphaned.length,
211
+ deleted,
212
+ total: orphaned.length,
213
+ errors
214
+ };
215
+ }
216
+ ```
217
+
218
+ ## Testing Module Exports
219
+
220
+ Test patterns for the resilience utilities are in `index.test.js`:
221
+
222
+ ### Testing `isConnectionError`
223
+
224
+ Detects Redis connection errors (network failures, misconfiguration):
225
+
226
+ ```javascript
227
+ import { isConnectionError } from "./index.js";
228
+
229
+ test("isConnectionError detects common connection failures", () => {
230
+ const econnRefused = new Error("connect ECONNREFUSED 127.0.0.1:6379");
231
+ assert.strictEqual(isConnectionError(econnRefused), true);
232
+
233
+ const econnReset = new Error("read ECONNRESET");
234
+ assert.strictEqual(isConnectionError(econnReset), true);
235
+
236
+ const enotFound = new Error("getaddrinfo ENOTFOUND redis.example.com");
237
+ assert.strictEqual(isConnectionError(enotFound), true);
238
+
239
+ const etimedOut = new Error("connect ETIMEDOUT");
240
+ assert.strictEqual(isConnectionError(etimedOut), true);
241
+
242
+ const otherError = new Error("Invalid queue name");
243
+ assert.strictEqual(isConnectionError(otherError), false);
244
+ });
245
+ ```
246
+
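A minimal implementation sketch consistent with the tests above, matching the common Node.js network error codes in the message. The real `index.js` may inspect `err.code` instead:

```javascript
// Treat an error as a connection error when its message contains one of
// Node's common network failure codes.
function isConnectionError(err) {
  return /ECONNREFUSED|ECONNRESET|ENOTFOUND|ETIMEDOUT/.test(err?.message ?? "");
}
```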
247
+ ### Testing `throttledError`
248
+
249
+ Suppresses repeated errors within a time window:
250
+
251
+ ```javascript
252
+ import { throttledError } from "./index.js";
253
+
254
+ test("throttledError suppresses duplicate errors within window", async () => {
255
+ const logger = createMockLogger();
256
+ const throttle = throttledError(logger, 100); // 100ms window
257
+
258
+ // First error logs
259
+ throttle(new Error("Redis unavailable"));
260
+ assert.strictEqual(logger.error.mock.callCount(), 1);
261
+
262
+ // Second error within window suppressed
263
+ throttle(new Error("Redis unavailable"));
264
+ assert.strictEqual(logger.error.mock.callCount(), 1);
265
+
266
+ // After window, error logs again
267
+ await new Promise(r => setTimeout(r, 150));
268
+ throttle(new Error("Redis unavailable"));
269
+ assert.strictEqual(logger.error.mock.callCount(), 2);
270
+ });
271
+ ```
272
+
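A sketch of `throttledError` matching the test above: forward to `logger.error` at most once per window, dropping calls inside it. (Illustrative; the real version may key by error message.)

```javascript
// Return a function that logs an error at most once per windowMs.
function throttledError(logger, windowMs) {
  let lastLoggedAt = -Infinity;
  return (err) => {
    const now = Date.now();
    if (now - lastLoggedAt >= windowMs) {
      lastLoggedAt = now;
      logger.error(err.message);
    }
  };
}
```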
273
+ ### Testing `waitForRedis`
274
+
275
+ Retries health checks at a fixed interval until the service responds:
276
+
277
+ ```javascript
278
+ import { waitForRedis } from "./index.js";
279
+
280
+ test("waitForRedis retries on connection failure", async () => {
281
+ const config = {
282
+ maxRetries: 3,
283
+ intervalMs: 50
284
+ };
285
+
286
+ let attempts = 0;
287
+ const health = async () => {
288
+ attempts++;
289
+ if (attempts < 2) throw new Error("Not ready");
290
+ return true;
291
+ };
292
+
293
+ const result = await waitForRedis(health, config);
294
+ assert.strictEqual(result, true);
295
+ assert.strictEqual(attempts, 2);
296
+ });
297
+ ```
298
+
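A sketch of `waitForRedis` consistent with the test above: call `health()` up to `maxRetries` times, sleeping `intervalMs` between failed attempts and rethrowing the last error if all attempts fail:

```javascript
// Retry an async health check with a fixed delay between attempts.
async function waitForRedis(health, { maxRetries, intervalMs }) {
  let lastError;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await health();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) await new Promise((r) => setTimeout(r, intervalMs));
    }
  }
  throw lastError;
}
```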
299
+ ## Docker Compose Configuration
300
+
301
+ The worker requires a properly configured Redis service. This should be present in your `docker-compose.yml`:
302
+
303
+ ```yaml
304
+ services:
305
+ redis:
306
+ image: redis:7-alpine
307
+ container_name: redis
308
+ ports:
309
+ - "${REDIS_PORT:-6379}:6379"
310
+ command: redis-server --appendonly yes
311
+ volumes:
312
+ - redis_data:/data
313
+ healthcheck:
314
+ test: ["CMD", "redis-cli", "ping"]
315
+ interval: 5s
316
+ timeout: 5s
317
+ retries: 5
318
+ ```
319
+
320
+ The `healthcheck` definition lets Docker and Compose (`depends_on: condition: service_healthy`) detect when Redis is ready:
321
+
322
+ ```bash
323
+ # Check health status (docker inspect returns an array)
324
+ docker inspect redis | jq '.[0].State.Health.Status'
325
+
326
+ # Show recent health check results
327
+ docker inspect redis | jq '.[0].State.Health.Log'
328
+ ```
329
+
330
+ ## Deployment & Readiness Checks
331
+
332
+ ### Local/Development
333
+
334
+ ```bash
335
+ # Terminal 1: Start infrastructure
336
+ docker compose up -d
337
+
338
+ # Terminal 2: Run preflight (quick validation)
339
+ pnpm preflight
340
+
341
+ # Terminal 3: Start worker
342
+ pnpm dev
343
+ ```
344
+
345
+ ### Container Orchestration (Kubernetes, Docker Compose, Railway)
346
+
347
+ Prefer the **preflight flag** for readiness probes:
348
+
349
+ ```yaml
350
+ # Kubernetes example
351
+ apiVersion: v1
352
+ kind: Pod
353
+ metadata:
354
+ name: quark-worker
355
+ spec:
356
+ containers:
357
+ - name: worker
358
+ image: node:20-alpine
359
+ command: ["node", "src/index.js"]
360
+ readinessProbe:
361
+ exec:
362
+ command: ["node", "src/index.js", "--preflight"]
363
+ initialDelaySeconds: 5
364
+ periodSeconds: 10
365
+ timeoutSeconds: 5
366
+ livenessProbe:
367
+ exec:
368
+ command: ["node", "src/index.js", "--preflight"]
369
+ initialDelaySeconds: 30
370
+ periodSeconds: 30
371
+ ```
372
+
373
+ ```yaml
374
+ # Docker Compose example
375
+ services:
376
+ worker:
377
+ build: .
378
+ depends_on:
379
+ redis:
380
+ condition: service_healthy
381
+ postgres:
382
+ condition: service_healthy
383
+ healthcheck:
384
+ test: ["CMD", "node", "src/index.js", "--preflight"]
385
+ interval: 30s
386
+ timeout: 10s
387
+ retries: 3
388
+ ```
389
+
390
+ ### Graceful Shutdown
391
+
392
+ The worker handles `SIGTERM` and `SIGINT` signals:
393
+
394
+ ```javascript
395
+ // When orchestrator sends SIGTERM (deployment update, node drain, etc.)
396
+ process.on("SIGTERM", async () => {
397
+ logger.info("Graceful shutdown initiated");
398
+
399
+ // 1. Stop accepting new jobs
400
+ for (const worker of workers) {
401
+ await worker.close();
402
+ }
403
+
404
+ // 2. Wait for in-flight jobs (up to 30s)
405
+ // 3. Disconnect database
406
+ // 4. Exit cleanly
407
+
408
+ process.exit(0);
409
+ });
410
+ ```
411
+
412
+ **Important**: Configure your orchestrator to wait at least 40 seconds before force-killing the container:
413
+
414
+ ```yaml
415
+ # Kubernetes
416
+ terminationGracePeriodSeconds: 40
417
+
418
+ # Docker Compose
419
+ stop_grace_period: 40s
420
+ ```
421
+
422
+ ## Architectural Patterns
423
+
424
+ ### Error Handling Strategy
425
+
426
+ The worker uses a **fault-tolerant error handling** approach:
427
+
428
+ 1. **Job-Level Errors**: Failures inside handlers trigger retries with exponential backoff
429
+ 2. **Connection Errors**: Temporary network issues don't crash the process; their error logging is throttled
430
+ 3. **Permanent Failures**: After max retries, jobs move to dead-letter queue
431
+ 4. **Process-Level Errors**: Unhandled errors exit the process (expect orchestrator restart)
432
+
433
+ ### State Management
434
+
435
+ Workers maintain minimal state:
436
+
437
+ ```javascript
438
+ // Global state (kept to a minimum)
439
+ const workers = []; // Active queue workers
440
+ let isShuttingDown = false; // Graceful shutdown flag
441
+ let connectionErrors = 0; // Failure tracking for circuit breaker
442
+
443
+ // Why minimal? Stateless workers are easier to:
444
+ // - Scale horizontally
445
+ // - Deploy without coordination
446
+ // - Replace during updates
447
+ ```
448
+
449
+ ### Connection Resilience
450
+
451
+ ```javascript
452
+ // On startup, wait for Redis with retries
453
+ await waitForRedis(checkHealth, {
454
+ maxRetries: parseInt(process.env.WORKER_HEALTH_RETRIES || "10", 10),
455
+ intervalMs: parseInt(process.env.WORKER_HEALTH_INTERVAL_MS || "1000", 10)
456
+ });
457
+
458
+ // During runtime, throttle connection errors
459
+ const reportError = throttledError(logger, 5000);
460
+
461
+ // Distinguish connection issues from application errors
462
+ if (isConnectionError(error)) {
463
+ reportError(error); // Throttled; may be suppressed within the window
464
+ // Don't exit; processing resumes once Redis is reachable again
465
+ } else {
466
+ logger.error("Application error", { error });
467
+ process.exit(1); // Exit on app-level errors; the orchestrator restarts the worker
468
+ }
469
+ ```
470
+
471
+ ### Graceful Shutdown
472
+
473
+ The shutdown sequence ensures no job data loss:
474
+
475
+ ```javascript
476
+ // Signal received (SIGTERM from orchestrator)
477
+ process.on("SIGTERM", async () => {
478
+ // 1. Stop accepting new jobs
479
+ isShuttingDown = true;
480
+ for (const worker of workers) {
481
+ await worker.close();
482
+ }
483
+
484
+ // 2. In-flight jobs are allowed to finish (up to WORKER_SHUTDOWN_TIMEOUT_MS)
485
+ // BullMQ will wait for handlers to return/throw
486
+
487
+ // 3. Close database connection
488
+ await prisma.$disconnect();
489
+
490
+ // 4. Exit
491
+ process.exit(0);
492
+ });
493
+ ```
494
+
495
+ ## Development Experience Improvements
496
+
497
+ ### Local Development Workflow
498
+
499
+ ```bash
500
+ # 1. Terminal 1: Start all services
501
+ docker compose up -d
502
+
503
+ # 2. Terminal 2: Test database connection
504
+ pnpm db:generate # Ensure Prisma client is current
505
+ pnpm db:seed # Populate test data if needed
506
+
507
+ # 3. Terminal 3: Start worker
508
+ pnpm dev
509
+
510
+ # 4. Terminal 4 (optional): Queue jobs
511
+ node --input-type=module -e "
512
+ import { createQueue } from '@techstream/quark-core';
513
+ const q = createQueue('emails');
514
+ const job = await q.add('send-welcome-email', { userId: 'test-1' });
515
+ console.log('Job enqueued:', job.id);
516
+ "
517
+ ```
518
+
519
+ ### Debugging Tips
520
+
521
+ **Enable verbose logging**: Set the `DEBUG=*` environment variable
522
+
523
+ ```bash
524
+ DEBUG=* pnpm dev
525
+ ```
526
+
527
+ **Inspect job state**: Use Prisma Studio to see job records
528
+
529
+ ```bash
530
+ pnpm db:studio
531
+ ```
532
+
533
+ **Monitor queue in real-time**: Connect to Redis CLI
534
+
535
+ ```bash
536
+ redis-cli
537
+ > LRANGE bull:emails:wait 0 -1
538
+ > MONITOR
539
+ ```
540
+
541
+ **Test a specific handler**:
542
+
543
+ ```bash
544
+ node --input-type=module -e "
545
+ import { handleSendWelcomeEmail } from './src/handlers/email.js';
546
+ const logger = console;
547
+ const job = { data: { userId: 'test-123' }, id: 'job-1' };
548
+ await handleSendWelcomeEmail(job, logger);
549
+ "
550
+ ```
551
+
552
+ ### Performance Considerations
553
+
554
+ - **Concurrency**: Default 5 jobs/queue. Adjust via `WORKER_CONCURRENCY`
555
+ - **Memory**: Each job handler runs in the same process. Memory leaks compound
556
+ - **Timeouts**: Default 30s per job. BullMQ will timeout long-running jobs
557
+ - **Database Connections**: Worker reuses a single Prisma client across all jobs
558
+
559
+ If you need horizontal scaling:
560
+
561
+ ```bash
562
+ # Each worker process is independent
563
+ # Deploy multiple instances; they'll auto-divide work via Redis
564
+
565
+ # Example: 3 workers on same Redis
566
+ NODE_INSTANCE_ID=1 pnpm dev
567
+ NODE_INSTANCE_ID=2 pnpm dev
568
+ NODE_INSTANCE_ID=3 pnpm dev
569
+ ```
570
+
571
+ ## Complete Example: Send Email Job
572
+
573
+ Here's a full example combining all patterns:
574
+
575
+ **Job Handler** (`src/handlers/email.js`):
576
+
577
+ ```javascript
578
+ import { emailService } from "@techstream/quark-core";
579
+ import { prisma } from "@techstream/quark-db";
580
+
581
+ export async function handleSendWelcomeEmail(bullJob, logger) {
582
+ const { userId } = bullJob.data;
583
+ const startTime = Date.now();
584
+
585
+ try {
586
+ // Validate input
587
+ if (!userId) {
588
+ throw new Error("userId is required");
589
+ }
590
+
591
+ // Fetch user
592
+ const user = await prisma.user.findUnique({
593
+ where: { id: userId },
594
+ select: { email: true, name: true }
595
+ });
596
+
597
+ if (!user?.email) {
598
+ throw new Error(`User ${userId} not found or has no email`);
599
+ }
600
+
601
+ // Send email
602
+ await emailService.send({
603
+ to: user.email,
604
+ template: "welcome",
605
+ subject: "Welcome!",
606
+ variables: { name: user.name }
607
+ });
608
+
609
+ const duration = Date.now() - startTime;
610
+ logger.info(`Welcome email sent to ${user.email}`, {
611
+ userId,
612
+ duration,
613
+ jobId: bullJob.id
614
+ });
615
+
616
+ return { success: true, email: user.email };
617
+ } catch (error) {
618
+ logger.error(`Failed to send welcome email`, {
619
+ userId,
620
+ error: error.message,
621
+ jobId: bullJob.id,
622
+ attempt: bullJob.attemptsMade
623
+ });
624
+ throw error; // Let BullMQ retry
625
+ }
626
+ }
627
+ ```
628
+
629
+ **Test** (`src/handlers/email.test.js`):
630
+
631
+ ```javascript
632
+ import assert from "node:assert";
633
+ import { test } from "node:test";
634
+ import { handleSendWelcomeEmail } from "./email.js";
635
+
636
+ test("handleSendWelcomeEmail sends email to valid user", async () => {
637
+ const mockLogger = { info: () => {}, error: () => {} };
638
+ const job = {
639
+ id: "job-1",
640
+ data: { userId: "user-123" },
641
+ attemptsMade: 0
642
+ };
643
+
644
+ const result = await handleSendWelcomeEmail(job, mockLogger);
645
+
646
+ assert.strictEqual(result.success, true);
647
+ assert.strictEqual(result.email, "user@example.com");
648
+ });
649
+
650
+ test("handleSendWelcomeEmail throws on missing userId", async () => {
651
+ const mockLogger = { info: () => {}, error: () => {} };
652
+ const job = {
653
+ id: "job-1",
654
+ data: {},
655
+ attemptsMade: 0
656
+ };
657
+
658
+ await assert.rejects(
659
+ () => handleSendWelcomeEmail(job, mockLogger),
660
+ { message: "userId is required" }
661
+ );
662
+ });
663
+ ```
664
+
665
+ ## FAQ
666
+
667
+ **Q: Why is my worker not starting?**
668
+ A: Check preflight: `pnpm preflight`. Common issues: Redis not running, DATABASE_URL not set, env vars not loaded.
669
+
670
+ **Q: How do I see job processing logs?**
671
+ A: Run in watch mode: `pnpm dev`. Logs include job ID, completion time, and any errors.
672
+
673
+ **Q: Can I process jobs in parallel?**
674
+ A: Yes! Increase `WORKER_CONCURRENCY`. Each worker can handle N jobs simultaneously.
675
+
676
+ **Q: What happens if a job fails?**
677
+ A: BullMQ retries with exponential backoff (default: 3 attempts, starting at 2s delay). After final retry, the job moves to a "failed" state.
678
+
679
+ **Q: How do I add a new job type?**
680
+ A: 1. Define in `@techstream/quark-jobs`
681
+ 2. Implement handler in `src/handlers/`
682
+ 3. Register in `src/handlers/index.js`
683
+ 4. Test with `src/handlers/*.test.js`
684
+
685
+ ## Further Reading
686
+
687
+ - [QUARK_USAGE.md](../docs/QUARK_USAGE.md) - Framework guidelines
688
+ - [ARCHITECTURE.md](../docs/ARCHITECTURE.md) - Core-only registry model
689
+ - [DATABASE.md](../docs/DATABASE.md) - Prisma setup and patterns
690
+ - [BullMQ Docs](https://docs.bullmq.io) - Job queue internals
@@ -8,6 +8,7 @@
8
8
  "scripts": {
9
9
  "test": "node --test $(find src -name '*.test.js')",
10
10
  "dev": "tsx watch src/index.js",
11
+ "preflight": "node src/index.js --preflight",
11
12
  "lint": "biome format --write && biome check --write"
12
13
  },
13
14
  "keywords": [],