@bratsos/workflow-engine 0.0.11

package/README.md ADDED
# @bratsos/workflow-engine

A **type-safe, distributed workflow engine** for AI-orchestrated processes. Features long-running job support, suspend/resume semantics, parallel execution, and integrated AI cost tracking.

---

## Table of Contents

- [Features](#features)
- [Requirements](#requirements)
- [Installation](#installation)
- [Getting Started](#getting-started)
  - [1. Database Setup](#1-database-setup)
  - [2. Define Your First Stage](#2-define-your-first-stage)
  - [3. Build a Workflow](#3-build-a-workflow)
  - [4. Create the Runtime](#4-create-the-runtime)
  - [5. Run a Workflow](#5-run-a-workflow)
- [Core Concepts](#core-concepts)
  - [Stages](#stages)
  - [Workflows](#workflows)
  - [Runtime](#runtime)
  - [Persistence](#persistence)
- [Common Patterns](#common-patterns)
  - [Accessing Previous Stage Output](#accessing-previous-stage-output)
  - [Parallel Execution](#parallel-execution)
  - [AI Integration](#ai-integration)
  - [Long-Running Batch Jobs](#long-running-batch-jobs)
  - [Config Presets](#config-presets)
- [Best Practices](#best-practices)
- [API Reference](#api-reference)
- [Troubleshooting](#troubleshooting)

---
## Features

| Feature | Description |
|---------|-------------|
| **Type-Safe** | Full TypeScript inference from input to output across all stages |
| **Async-First** | Native support for long-running operations (batch jobs that take hours/days) |
| **AI-Native** | Built-in tracking of prompts, responses, tokens, and costs |
| **Event-Driven** | Real-time progress updates via Server-Sent Events |
| **Parallel Execution** | Run independent stages concurrently |
| **Resume Capability** | Automatic state persistence and recovery from failures |
| **Distributed** | Job queue with priority support and stale lock recovery |

---
## Requirements

- **Node.js** >= 18.0.0
- **TypeScript** >= 5.0.0
- **PostgreSQL** >= 14 (for Prisma persistence)
- **Zod** >= 3.22.0

### Optional Peer Dependencies

Install based on which AI providers you use:

```bash
# For Google AI
npm install @google/genai

# For OpenAI
npm install openai

# For Anthropic
npm install @anthropic-ai/sdk

# For Prisma persistence (recommended)
npm install @prisma/client
```

---
## Installation

```bash
npm install @bratsos/workflow-engine zod
# or
pnpm add @bratsos/workflow-engine zod
# or
yarn add @bratsos/workflow-engine zod
```

---
## Getting Started

This guide walks you through creating your first workflow from scratch.

### 1. Database Setup

The engine requires persistence tables. Add these to your Prisma schema:
```prisma
// schema.prisma

// Unified status enum for workflows, stages, and jobs
enum Status {
  PENDING
  RUNNING
  SUSPENDED
  COMPLETED
  FAILED
  CANCELLED
  SKIPPED
}

model WorkflowRun {
  id           String    @id @default(cuid())
  createdAt    DateTime  @default(now())
  updatedAt    DateTime  @updatedAt
  workflowId   String    // e.g., "document-processor"
  workflowName String    // e.g., "Document Processor"
  workflowType String    // For grouping/filtering
  status       Status    @default(PENDING)
  startedAt    DateTime?
  completedAt  DateTime?
  duration     Int?      // milliseconds
  input        Json
  output       Json?
  config       Json      @default("{}")
  totalCost    Float     @default(0)
  totalTokens  Int       @default(0)
  priority     Int       @default(5)
  metadata     Json?     // Optional domain-specific data

  stages    WorkflowStage[]
  logs      WorkflowLog[]
  artifacts WorkflowArtifact[]

  @@index([status])
  @@index([workflowId])
}

model WorkflowStage {
  id             String      @id @default(cuid())
  createdAt      DateTime    @default(now())
  updatedAt      DateTime    @updatedAt
  workflowRunId  String
  workflowRun    WorkflowRun @relation(fields: [workflowRunId], references: [id], onDelete: Cascade)
  stageId        String      // e.g., "extract-text"
  stageName      String      // e.g., "Extract Text"
  stageNumber    Int         // 1-based execution order
  executionGroup Int         // For parallel grouping
  status         Status      @default(PENDING)
  startedAt      DateTime?
  completedAt    DateTime?
  duration       Int?        // milliseconds
  inputData      Json?
  outputData     Json?       // May contain { _artifactKey: "..." }
  config         Json?
  suspendedState Json?       // State for async batch stages
  resumeData     Json?
  nextPollAt     DateTime?   // When to check again
  pollInterval   Int?        // milliseconds
  maxWaitUntil   DateTime?
  metrics        Json?       // { cost, tokens, custom... }
  embeddingInfo  Json?
  errorMessage   String?

  logs WorkflowLog[]

  @@unique([workflowRunId, stageId])
  @@index([status])
  @@index([nextPollAt])
}

model WorkflowLog {
  id              String         @id @default(cuid())
  createdAt       DateTime       @default(now())
  workflowRunId   String?
  workflowRun     WorkflowRun?   @relation(fields: [workflowRunId], references: [id], onDelete: Cascade)
  workflowStageId String?
  workflowStage   WorkflowStage? @relation(fields: [workflowStageId], references: [id], onDelete: Cascade)
  level           String         // DEBUG, INFO, WARN, ERROR
  message         String
  metadata        Json?

  @@index([workflowRunId])
  @@index([workflowStageId])
}

model WorkflowArtifact {
  id            String      @id @default(cuid())
  createdAt     DateTime    @default(now())
  workflowRunId String
  workflowRun   WorkflowRun @relation(fields: [workflowRunId], references: [id], onDelete: Cascade)
  key           String      // Unique key within the run
  type          String      // STAGE_OUTPUT, ARTIFACT, METADATA
  data          Json
  size          Int         // bytes

  @@unique([workflowRunId, key])
  @@index([workflowRunId])
}

model AICall {
  id           String   @id @default(cuid())
  createdAt    DateTime @default(now())
  topic        String   // Hierarchical: "workflow.{runId}.stage.{stageId}"
  callType     String   // text, object, embed, stream
  modelKey     String   // e.g., "gemini-2.5-flash"
  modelId      String   // e.g., "google/gemini-2.5-flash-preview"
  prompt       String   @db.Text
  response     String   @db.Text
  inputTokens  Int
  outputTokens Int
  cost         Float

  @@index([topic])
}

model JobQueue {
  id            String    @id @default(cuid())
  createdAt     DateTime  @default(now())
  updatedAt     DateTime  @updatedAt
  workflowRunId String
  stageId       String
  status        Status    @default(PENDING)
  priority      Int       @default(5)
  attempt       Int       @default(1)
  maxAttempts   Int       @default(3)
  workerId      String?
  lockedAt      DateTime?
  nextPollAt    DateTime?
  payload       Json?
  lastError     String?

  @@index([status, priority])
  @@index([nextPollAt])
}
```

Run the migration:

```bash
npx prisma migrate dev --name add-workflow-tables
npx prisma generate
```
### 2. Define Your First Stage

Create a file `stages/extract-text.ts`:

```typescript
import { defineStage } from "@bratsos/workflow-engine";
import { z } from "zod";

// Define schemas for type safety
const InputSchema = z.object({
  url: z.string().url().describe("URL of the document to process"),
});

const OutputSchema = z.object({
  text: z.string().describe("Extracted text content"),
  wordCount: z.number().describe("Number of words extracted"),
  metadata: z.object({
    title: z.string().optional(),
    author: z.string().optional(),
  }),
});

const ConfigSchema = z.object({
  maxLength: z.number().default(50000).describe("Maximum text length to extract"),
  includeMetadata: z.boolean().default(true),
});

export const extractTextStage = defineStage({
  id: "extract-text",
  name: "Extract Text",
  description: "Extracts text content from a document URL",

  schemas: {
    input: InputSchema,
    output: OutputSchema,
    config: ConfigSchema,
  },

  async execute(ctx) {
    const { url } = ctx.input;
    const { maxLength, includeMetadata } = ctx.config;

    // Log progress
    ctx.log("INFO", "Starting text extraction", { url });

    // Simulate fetching the document (replace with a real implementation)
    const response = await fetch(url);
    const text = await response.text();
    const truncatedText = text.slice(0, maxLength);

    ctx.log("INFO", "Extraction complete", {
      originalLength: text.length,
      truncatedLength: truncatedText.length,
    });

    return {
      output: {
        text: truncatedText,
        wordCount: truncatedText.split(/\s+/).length,
        metadata: includeMetadata ? { title: "Document Title" } : {},
      },
      // Optional: custom metrics for observability
      customMetrics: {
        bytesProcessed: text.length,
      },
    };
  },
});

// Export types for use in other stages
export type ExtractTextOutput = z.infer<typeof OutputSchema>;
```
### 3. Build a Workflow

Create a file `workflows/document-processor.ts`:

```typescript
import { WorkflowBuilder } from "@bratsos/workflow-engine";
import { z } from "zod";
import { extractTextStage } from "../stages/extract-text";
import { summarizeStage } from "../stages/summarize";
import { analyzeStage } from "../stages/analyze";

const InputSchema = z.object({
  url: z.string().url(),
  options: z.object({
    generateSummary: z.boolean().default(true),
    analyzeContent: z.boolean().default(true),
  }).optional(),
});

export const documentProcessorWorkflow = new WorkflowBuilder(
  "document-processor",  // Unique ID
  "Document Processor",  // Display name
  "Extracts, summarizes, and analyzes documents",
  InputSchema,
  InputSchema            // Initial output type (will be inferred)
)
  .pipe(extractTextStage) // Stage 1: Extract
  .pipe(summarizeStage)   // Stage 2: Summarize
  .pipe(analyzeStage)     // Stage 3: Analyze
  .build();

// Export input type for API consumers
export type DocumentProcessorInput = z.infer<typeof InputSchema>;
```
### 4. Create the Runtime

Create a file `runtime.ts`:

```typescript
import {
  createWorkflowRuntime,
  createPrismaWorkflowPersistence,
  createPrismaJobQueue,
  createPrismaAICallLogger,
  type WorkflowRegistry,
} from "@bratsos/workflow-engine";
import { PrismaClient } from "@prisma/client";
import { documentProcessorWorkflow } from "./workflows/document-processor";

// Initialize Prisma client
const prisma = new PrismaClient();

// Create persistence implementations
const persistence = createPrismaWorkflowPersistence(prisma);
const jobQueue = createPrismaJobQueue(prisma);
const aiCallLogger = createPrismaAICallLogger(prisma);

// Create a workflow registry
const registry: WorkflowRegistry = {
  getWorkflow(id: string) {
    const workflows: Record<string, any> = {
      "document-processor": documentProcessorWorkflow,
      // Add more workflows here
    };
    return workflows[id];
  },
};

// Optional: Define priority function for different workflow types
function getWorkflowPriority(workflowId: string): number {
  const priorities: Record<string, number> = {
    "document-processor": 5, // Normal priority
    "urgent-task": 10,       // High priority
    "background-job": 2,     // Low priority
  };
  return priorities[workflowId] ?? 5;
}

// Create the runtime
export const runtime = createWorkflowRuntime({
  persistence,
  jobQueue,
  registry,
  aiCallLogger,
  pollIntervalMs: 10000,   // Check for suspended stages every 10s
  jobPollIntervalMs: 1000, // Poll job queue every 1s
  getWorkflowPriority,     // Optional priority function
});

// Convenience exports
export { prisma, persistence, jobQueue };
```
### 5. Run a Workflow

#### Option A: Worker Process (Recommended for Production)

Create a file `worker.ts`:

```typescript
import { runtime } from "./runtime";

// Register shutdown handlers
process.on("SIGTERM", () => runtime.stop());
process.on("SIGINT", () => runtime.stop());

// Start the worker
console.log("Starting workflow worker...");
runtime.start();
```

Run the worker:

```bash
npx tsx worker.ts
```

#### Option B: Queue from an API Endpoint

```typescript
import { runtime } from "./runtime";

// In your API route handler
export async function POST(request: Request) {
  const { url } = await request.json();

  const { workflowRunId } = await runtime.createRun({
    workflowId: "document-processor",
    input: { url },
    config: {
      "extract-text": { maxLength: 100000 },
      "summarize": { modelKey: "gemini-2.5-flash" },
    },
  });

  return Response.json({ workflowRunId });
}
```

---
## Core Concepts

### Stages

A stage is the atomic unit of work. Every stage has:

| Property | Description |
|----------|-------------|
| `id` | Unique identifier within the workflow |
| `name` | Human-readable name |
| `schemas.input` | Zod schema for input validation |
| `schemas.output` | Zod schema for output validation |
| `schemas.config` | Zod schema for stage configuration |
| `execute(ctx)` | The function that performs the work |

**Stage Modes:**

| Mode | Use Case |
|------|----------|
| `sync` (default) | Most stages: execute and return immediately |
| `async-batch` | Long-running batch APIs (OpenAI Batch, Google Batch, etc.) |

### Workflows

A workflow is a directed graph of stages built using the fluent `WorkflowBuilder` API:

```typescript
new WorkflowBuilder(id, name, description, inputSchema, outputSchema)
  .pipe(stageA)               // Sequential: stageA runs first
  .pipe(stageB)               // Sequential: stageB runs after stageA
  .parallel([stageC, stageD]) // Parallel: stageC and stageD run concurrently
  .pipe(stageE)               // Sequential: stageE runs after both complete
  .build();
```
### Runtime

The `WorkflowRuntime` manages workflow execution:

```typescript
const runtime = createWorkflowRuntime({
  persistence,  // Database operations
  jobQueue,     // Distributed job queue
  registry,     // Workflow definitions
  aiCallLogger, // AI call tracking (optional)
});

// Start as a worker process
await runtime.start();

// Queue a new workflow run
await runtime.createRun({ workflowId, input, config });

// Create AI helper for a stage
const ai = runtime.createAIHelper("workflow.run-123.stage.extract");
```
### Persistence

The engine uses three persistence interfaces:

| Interface | Purpose |
|-----------|---------|
| `WorkflowPersistence` | Workflow runs, stages, logs, artifacts |
| `JobQueue` | Distributed job queue with priority and retries |
| `AICallLogger` | AI call tracking with cost aggregation |

**Built-in implementations:**

- `createPrismaWorkflowPersistence(prisma)` - PostgreSQL via Prisma
- `createPrismaJobQueue(prisma)` - PostgreSQL with `FOR UPDATE SKIP LOCKED`
- `createPrismaAICallLogger(prisma)` - PostgreSQL

**Prisma version compatibility:**

The persistence layer supports both Prisma 6.x and 7.x. Enum values are automatically resolved based on your Prisma version:

```typescript
import { createPrismaWorkflowPersistence } from "@bratsos/workflow-engine/persistence/prisma";

// Works with both Prisma 6.x (string enums) and Prisma 7.x (typed enums)
const persistence = createPrismaWorkflowPersistence(prisma);
```

For advanced use cases, you can use the enum helper directly:

```typescript
import { createEnumHelper } from "@bratsos/workflow-engine/persistence/prisma";

const enums = createEnumHelper(prisma);
const status = enums.status("PENDING"); // Returns typed enum for Prisma 7.x, string for 6.x
```

---
## Common Patterns

### Accessing Previous Stage Output

Use `ctx.require()` for type-safe access to any previous stage's output:

```typescript
export const analyzeStage = defineStage({
  id: "analyze",
  name: "Analyze Content",

  schemas: {
    input: "none", // This stage reads from workflowContext
    output: AnalysisOutputSchema,
    config: ConfigSchema,
  },

  async execute(ctx) {
    // Type-safe access to previous stage output
    const extracted = ctx.require("extract-text"); // Throws if missing
    const summary = ctx.optional("summarize");     // Returns undefined if missing

    // Use the data
    console.log(`Analyzing ${extracted.wordCount} words`);

    return {
      output: { /* ... */ },
    };
  },
});
```

> **Important:** Use `input: "none"` for stages that read from `workflowContext` instead of receiving direct input from the previous stage.
### Parallel Execution

Run independent stages concurrently:

```typescript
const workflow = new WorkflowBuilder(/* ... */)
  .pipe(extractStage)
  .parallel([
    sentimentAnalysisStage, // These three run
    keywordExtractionStage, // at the same time
    languageDetectionStage,
  ])
  .pipe(aggregateResultsStage) // Runs after all parallel stages complete
  .build();
```

In subsequent stages, access parallel outputs by stage ID:

```typescript
async execute(ctx) {
  const sentiment = ctx.require("sentiment-analysis");
  const keywords = ctx.require("keyword-extraction");
  const language = ctx.require("language-detection");
  // ...
}
```
### AI Integration

Use the `AIHelper` for tracked AI calls:

```typescript
import { z } from "zod";
import { runtime } from "./runtime"; // the runtime created in step 4

async execute(ctx) {
  // Create AI helper with topic for cost tracking
  const ai = runtime.createAIHelper(`workflow.${ctx.workflowRunId}.${ctx.stageId}`);

  // Generate text
  const { text, cost, inputTokens, outputTokens } = await ai.generateText(
    "gemini-2.5-flash",
    "Summarize this document: " + ctx.input.text
  );

  // Generate structured object
  const { object: analysis } = await ai.generateObject(
    "gemini-2.5-flash",
    "Analyze this text: " + ctx.input.text,
    z.object({
      sentiment: z.enum(["positive", "negative", "neutral"]),
      topics: z.array(z.string()),
    })
  );

  // Generate embeddings
  const { embedding, dimensions } = await ai.embed(
    "gemini-embedding-001",
    ctx.input.text,
    { dimensions: 768 }
  );

  // All calls are automatically logged with cost tracking
  return { output: { text, analysis, embedding } };
}
```
### Long-Running Batch Jobs

For operations that may take hours (OpenAI Batch API, Google Vertex Batch, etc.):

```typescript
import { defineAsyncBatchStage } from "@bratsos/workflow-engine";

export const batchProcessingStage = defineAsyncBatchStage({
  id: "batch-process",
  name: "Batch Processing",
  mode: "async-batch", // Required for async stages

  schemas: { input: InputSchema, output: OutputSchema, config: ConfigSchema },

  async execute(ctx) {
    // If we're resuming from suspension, the batch is complete
    if (ctx.resumeState) {
      const results = await fetchBatchResults(ctx.resumeState.batchId);
      return { output: results };
    }

    // First execution: submit the batch job
    const batch = await submitBatchToOpenAI(ctx.input.prompts);

    // Return suspended state - workflow will pause here
    return {
      suspended: true,
      state: {
        batchId: batch.id,
        submittedAt: new Date().toISOString(),
        pollInterval: 3600000, // 1 hour
        maxWaitTime: 86400000, // 24 hours
      },
      pollConfig: {
        pollInterval: 3600000,
        maxWaitTime: 86400000,
        nextPollAt: new Date(Date.now() + 3600000),
      },
    };
  },

  // Called by the orchestrator to check if batch is ready
  async checkCompletion(state, ctx) {
    const status = await checkBatchStatus(state.batchId);

    if (status === "completed") {
      return { ready: true }; // Workflow will resume
    }

    if (status === "failed") {
      return { ready: true, error: "Batch processing failed" };
    }

    // Not ready yet - check again later
    return {
      ready: false,
      nextCheckIn: 60000, // Override poll interval for this check
    };
  },
});
```
### Config Presets

Use built-in config presets for common patterns:

```typescript
import {
  withAIConfig,
  withStandardConfig,
  withFullConfig
} from "@bratsos/workflow-engine";
import { z } from "zod";

// AI-focused config (model, temperature, maxTokens)
const MyConfigSchema = withAIConfig(z.object({
  customField: z.string(),
}));

// Standard config (AI + concurrency + featureFlags)
const StandardConfigSchema = withStandardConfig(z.object({
  specificOption: z.boolean().default(true),
}));

// Full config (Standard + debug options)
const DebugConfigSchema = withFullConfig(z.object({
  processType: z.string(),
}));
```
741
+
742
+ ---
743
+
744
+ ## Best Practices
745
+
746
+ ### 1. Schema Design
747
+
748
+ ```typescript
749
+ // ✅ Good: Strict schemas with descriptions and defaults
750
+ const ConfigSchema = z.object({
751
+ modelKey: z.string()
752
+ .default("gemini-2.5-flash")
753
+ .describe("AI model to use for processing"),
754
+ maxRetries: z.number()
755
+ .min(0)
756
+ .max(10)
757
+ .default(3)
758
+ .describe("Maximum retry attempts on failure"),
759
+ });
760
+
761
+ // ❌ Bad: Loose typing
762
+ const ConfigSchema = z.object({
763
+ model: z.any(),
764
+ retries: z.number(),
765
+ });
766
+ ```
767
+
768
### 2. Stage Dependencies

```typescript
// ✅ Good: Declare dependencies explicitly
export const analyzeStage = defineStage({
  id: "analyze",
  dependencies: ["extract-text", "summarize"], // Build-time validation
  schemas: { input: "none", /* ... */ },
  // ...
});

// ❌ Bad: Hidden dependencies
export const analyzeStage = defineStage({
  id: "analyze",
  schemas: { input: SomeSchema, /* ... */ }, // Where does input come from?
  // ...
});
```
### 3. Logging

```typescript
async execute(ctx) {
  // ✅ Good: Structured logging with context
  ctx.log("INFO", "Starting processing", {
    itemCount: items.length,
    config: ctx.config,
  });

  // ✅ Good: Progress updates for long operations
  for (const [index, item] of items.entries()) {
    ctx.onProgress({
      progress: (index + 1) / items.length,
      message: `Processing item ${index + 1}/${items.length}`,
    });
  }

  // ❌ Bad: console.log (not persisted)
  console.log("Processing...");
}
```
### 4. Error Handling

```typescript
async execute(ctx) {
  try {
    // Validate early
    if (!ctx.input.url.startsWith("https://")) {
      throw new Error("Only HTTPS URLs are supported");
    }

    const result = await processDocument(ctx.input);
    return { output: result };

  } catch (error) {
    // Log errors with context
    ctx.log("ERROR", "Processing failed", {
      error: error instanceof Error ? error.message : String(error),
      input: ctx.input,
    });

    // Re-throw to mark stage as failed
    throw error;
  }
}
```
### 5. Performance

```typescript
// ✅ Good: Use parallel stages for independent work
.parallel([
  fetchDataFromSourceA,
  fetchDataFromSourceB,
  fetchDataFromSourceC,
])

// ✅ Good: Use batch processing for many AI calls
const results = await ai.batch("gpt-4o").submit(
  items.map((item, i) => ({
    id: `item-${i}`,
    prompt: `Process: ${item}`,
    schema: OutputSchema,
  }))
);

// ❌ Bad: Sequential AI calls when parallel would work
for (const item of items) {
  await ai.generateText("gpt-4o", `Process: ${item}`);
}
```

---
## API Reference

### Core Exports

```typescript
// Stage definition
import { defineStage, defineAsyncBatchStage } from "@bratsos/workflow-engine";

// Workflow building
import { WorkflowBuilder, Workflow } from "@bratsos/workflow-engine";

// Runtime
import { createWorkflowRuntime, WorkflowRuntime } from "@bratsos/workflow-engine";

// Persistence (Prisma)
import {
  createPrismaWorkflowPersistence,
  createPrismaJobQueue,
  createPrismaAICallLogger,
} from "@bratsos/workflow-engine";

// AI Helper
import { createAIHelper, type AIHelper } from "@bratsos/workflow-engine";

// Types
import type {
  WorkflowPersistence,
  JobQueue,
  AICallLogger,
  WorkflowRunRecord,
  WorkflowStageRecord,
} from "@bratsos/workflow-engine";
```
### `createWorkflowRuntime(config)`

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `persistence` | `WorkflowPersistence` | required | Database operations |
| `jobQueue` | `JobQueue` | required | Job queue implementation |
| `registry` | `WorkflowRegistry` | required | Workflow definitions |
| `aiCallLogger` | `AICallLogger` | optional | AI call tracking |
| `pollIntervalMs` | `number` | 10000 | Interval for checking suspended stages |
| `jobPollIntervalMs` | `number` | 1000 | Interval for polling job queue |
| `workerId` | `string` | auto | Unique worker identifier |
| `staleJobThresholdMs` | `number` | 60000 | Time before a job is considered stale |
| `getWorkflowPriority` | `(id: string) => number` | optional | Priority function (1-10, higher = more important) |
### `WorkflowRuntime`

| Method | Description |
|--------|-------------|
| `start()` | Start polling for jobs (runs until stopped) |
| `stop()` | Stop the worker gracefully |
| `createRun(options)` | Queue a new workflow run |
| `createAIHelper(topic)` | Create an AI helper bound to the logger |
| `transitionWorkflow(runId)` | Manually transition a workflow to its next stage |
| `pollSuspendedStages()` | Manually poll suspended stages |
### `CreateRunOptions`

| Option | Type | Required | Description |
|--------|------|----------|-------------|
| `workflowId` | `string` | yes | ID of the workflow to run |
| `input` | `object` | yes | Input data matching the workflow's input schema |
| `config` | `object` | no | Stage configurations keyed by stage ID |
| `priority` | `number` | no | Priority (1-10, overrides `getWorkflowPriority`) |
| `metadata` | `object` | no | Domain-specific metadata |
---

## Troubleshooting

### "Workflow not found in registry"

Ensure the workflow is registered before creating runs:

```typescript
const registry = {
  getWorkflow(id) {
    const workflows = {
      "my-workflow": myWorkflow, // Add your workflow here
    };
    return workflows[id];
  },
};
```
+ ### "Stage X depends on Y which was not found"
952
+
953
+ Verify all dependencies are included in the workflow:
954
+
955
+ ```typescript
956
+ // If stage "analyze" depends on "extract":
957
+ .pipe(extractStage) // Must be piped before
958
+ .pipe(analyzeStage) // analyze can now access extract's output
959
+ ```
960
+
961
### Jobs stuck in "RUNNING"

This usually means a worker crashed. Stale lock recovery automatically releases such jobs after `staleJobThresholdMs` (default 60s).

To manually release stale jobs:

```typescript
await jobQueue.releaseStaleJobs(60000); // Release jobs locked > 60s
```
### AI calls not being tracked

Ensure you pass `aiCallLogger` to the runtime:

```typescript
const runtime = createWorkflowRuntime({
  // ...
  aiCallLogger: createPrismaAICallLogger(prisma), // Required for tracking
});
```

---

## License

MIT