@pgflow/dsl 0.0.5 → 0.0.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (71)
  1. package/{CHANGELOG.md → dist/CHANGELOG.md} +2 -0
  2. package/package.json +10 -8
  3. package/__tests__/runtime/flow.test.ts +0 -121
  4. package/__tests__/runtime/steps.test.ts +0 -183
  5. package/__tests__/runtime/utils.test.ts +0 -149
  6. package/__tests__/types/dsl-types.test-d.ts +0 -103
  7. package/__tests__/types/example-flow.test-d.ts +0 -76
  8. package/__tests__/types/extract-flow-input.test-d.ts +0 -71
  9. package/__tests__/types/extract-flow-steps.test-d.ts +0 -74
  10. package/__tests__/types/getStepDefinition.test-d.ts +0 -65
  11. package/__tests__/types/step-input.test-d.ts +0 -212
  12. package/__tests__/types/step-output.test-d.ts +0 -55
  13. package/brainstorming/condition/condition-alternatives.md +0 -219
  14. package/brainstorming/condition/condition-with-flexibility.md +0 -303
  15. package/brainstorming/condition/condition.md +0 -139
  16. package/brainstorming/condition/implementation-plan.md +0 -372
  17. package/brainstorming/dsl/cli-json-schema.md +0 -225
  18. package/brainstorming/dsl/cli.md +0 -179
  19. package/brainstorming/dsl/create-compilator.md +0 -25
  20. package/brainstorming/dsl/dsl-analysis-2.md +0 -166
  21. package/brainstorming/dsl/dsl-analysis.md +0 -512
  22. package/brainstorming/dsl/dsl-critique.md +0 -41
  23. package/brainstorming/fanouts/fanout-subflows-flattened-vs-subruns.md +0 -213
  24. package/brainstorming/fanouts/fanouts-task-index.md +0 -150
  25. package/brainstorming/fanouts/fanouts-with-conditions-and-subflows.md +0 -239
  26. package/brainstorming/subflows/branching.ts.md +0 -38
  27. package/brainstorming/subflows/subflows-callbacks.ts.md +0 -124
  28. package/brainstorming/subflows/subflows-classes.ts.md +0 -83
  29. package/brainstorming/subflows/subflows-flattening-versioned.md +0 -119
  30. package/brainstorming/subflows/subflows-flattening.md +0 -138
  31. package/brainstorming/subflows/subflows.md +0 -118
  32. package/brainstorming/subflows/subruns-table.md +0 -282
  33. package/brainstorming/subflows/subruns.md +0 -315
  34. package/brainstorming/versioning/breaking-and-non-breaking-flow-changes.md +0 -259
  35. package/docs/refactor-edge-worker.md +0 -146
  36. package/docs/versioning.md +0 -19
  37. package/eslint.config.cjs +0 -22
  38. package/out-tsc/vitest/__tests__/runtime/flow.test.d.ts +0 -2
  39. package/out-tsc/vitest/__tests__/runtime/flow.test.d.ts.map +0 -1
  40. package/out-tsc/vitest/__tests__/runtime/steps.test.d.ts +0 -2
  41. package/out-tsc/vitest/__tests__/runtime/steps.test.d.ts.map +0 -1
  42. package/out-tsc/vitest/__tests__/runtime/utils.test.d.ts +0 -2
  43. package/out-tsc/vitest/__tests__/runtime/utils.test.d.ts.map +0 -1
  44. package/out-tsc/vitest/__tests__/types/dsl-types.test-d.d.ts +0 -2
  45. package/out-tsc/vitest/__tests__/types/dsl-types.test-d.d.ts.map +0 -1
  46. package/out-tsc/vitest/__tests__/types/example-flow.test-d.d.ts +0 -2
  47. package/out-tsc/vitest/__tests__/types/example-flow.test-d.d.ts.map +0 -1
  48. package/out-tsc/vitest/__tests__/types/extract-flow-input.test-d.d.ts +0 -2
  49. package/out-tsc/vitest/__tests__/types/extract-flow-input.test-d.d.ts.map +0 -1
  50. package/out-tsc/vitest/__tests__/types/extract-flow-steps.test-d.d.ts +0 -2
  51. package/out-tsc/vitest/__tests__/types/extract-flow-steps.test-d.d.ts.map +0 -1
  52. package/out-tsc/vitest/__tests__/types/getStepDefinition.test-d.d.ts +0 -2
  53. package/out-tsc/vitest/__tests__/types/getStepDefinition.test-d.d.ts.map +0 -1
  54. package/out-tsc/vitest/__tests__/types/step-input.test-d.d.ts +0 -2
  55. package/out-tsc/vitest/__tests__/types/step-input.test-d.d.ts.map +0 -1
  56. package/out-tsc/vitest/__tests__/types/step-output.test-d.d.ts +0 -2
  57. package/out-tsc/vitest/__tests__/types/step-output.test-d.d.ts.map +0 -1
  58. package/out-tsc/vitest/tsconfig.spec.tsbuildinfo +0 -1
  59. package/out-tsc/vitest/vite.config.d.ts +0 -3
  60. package/out-tsc/vitest/vite.config.d.ts.map +0 -1
  61. package/project.json +0 -28
  62. package/prompts/edge-worker-refactor.md +0 -105
  63. package/src/dsl.ts +0 -318
  64. package/src/example-flow.ts +0 -67
  65. package/src/index.ts +0 -1
  66. package/src/utils.ts +0 -84
  67. package/tsconfig.json +0 -13
  68. package/tsconfig.lib.json +0 -26
  69. package/tsconfig.spec.json +0 -35
  70. package/typecheck.log +0 -120
  71. package/vite.config.ts +0 -57
@@ -1,372 +0,0 @@
1
- # Implementation Plan for Conditional Skipping Logic
2
-
3
- Below is a comprehensive plan, split into multiple stages, to introduce and refine the “skipping logic” (`runIf` / `runUnless`) in both the database schema and the Flow DSL. At the end, you will find an “Actionable Steps” section with quick TODO items.
4
-
5
- ---
6
-
7
- ## Stage 1: Add Condition Columns and "skipped" Status
8
-
9
- ### Current State
10
-
11
- At present, our `steps` table does not store conditional logic at all. The concept of skipping a step is not directly represented, and we only have these step statuses in `step_states`:
12
-
13
- - `created`
14
- - `started`
15
- - `completed`
16
- - `failed`
17
-
18
- Similarly, the `steps` table has no columns for conditions like `runIf` or `runUnless`.
19
-
20
- ### Changes in This Stage
21
-
22
- **1.** We will add two columns to the `steps` table for storing condition data:
23
-
24
- - `run_if_condition jsonb` (optional)
25
- - `run_unless_condition jsonb` (optional)
26
-
27
- **2.** We will add a new state `skipped` in `step_states.status`. This allows us to mark a step as definitively skipped (distinct from `completed` or `failed`).
28
-
29
- **3.** We will allow `runIf` and `runUnless` to co-exist. If both are given, the step runs only when the `runIf` condition matches and the `runUnless` condition does not (i.e., `runIf(input) && !runUnless(input)`).
30
-
31
- ### End Result
32
-
33
- After Stage 1:
34
-
35
- - The database schema can store conditions in `steps.run_if_condition` and `steps.run_unless_condition`.
36
- - We can mark steps as `skipped` in `step_states` when conditions do not match, making the skip “visible” at runtime.
37
-
38
- ### Code Snippets
39
-
40
- #### 1. Add Condition Columns
41
-
42
- Below is an example migration snippet to add new columns to the `steps` table:
43
-
44
- ```sql
45
- -- Example: Add run_if_condition and run_unless_condition columns to steps
46
- ALTER TABLE pgflow.steps
47
- ADD COLUMN run_if_condition JSONB DEFAULT NULL;
48
-
49
- ALTER TABLE pgflow.steps
50
- ADD COLUMN run_unless_condition JSONB DEFAULT NULL;
51
- ```
52
-
53
- > **Commentary**:
54
- >
55
- > - We choose `JSONB` to store advanced or structured conditions.
56
- > - Defaulting to `NULL` means “no condition.”
57
-
58
- #### 2. Introduce Skipped Status
59
-
60
- We need to update checks in `step_states` so we can store `skipped`:
61
-
62
- ```sql
63
- -- Example: Update step_states to allow 'skipped' as a valid status
64
- ALTER TABLE pgflow.step_states
65
- DROP CONSTRAINT step_states_status_check,
66
- ADD CONSTRAINT step_states_status_check CHECK (
67
- status IN ('created', 'started', 'completed', 'failed', 'skipped')
68
- );
69
- ```
70
-
71
- If your existing constraint is named differently, adjust accordingly.
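-
- If the constraint name is unknown, a small helper query (not pgflow-specific) can list the CHECK constraints on the table:
-
- ```sql
- -- List CHECK constraints defined on pgflow.step_states
- SELECT conname, pg_get_constraintdef(oid)
- FROM pg_constraint
- WHERE conrelid = 'pgflow.step_states'::regclass
-   AND contype = 'c';
- ```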
72
-
73
- ### Alternatives That We Dismissed
74
-
75
- - **Storing conditions in `step_states`:** This would bloat runtime data. Since conditions are definitions rather than run-specific, storing them in `steps` is more appropriate.
76
- - **Using a single JSON column for both `runIf` and `runUnless`:** Doing so would complicate parsing logic. Separate columns are simpler to handle (they can coexist without confusion).
77
-
78
- ### Additional Bonus Ideas
79
-
80
- - **Add a short text column for condition expressions**: If you eventually want to store a small “human-readable” string describing the condition, it’s quite cheap to add a text column that mirrors the JSON structure or a snippet of code.
81
- - **Add an indexed expression on `run_if_condition`** if you plan to do partial lookups or advanced queries. This is easy to maintain if you rely on standard Postgres JSON indexing.
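-
- If you go the indexing route, a minimal sketch using standard Postgres JSONB indexing (optional, not required by pgflow):
-
- ```sql
- -- GIN index to speed up containment (@>) lookups on the condition column
- CREATE INDEX steps_run_if_condition_idx
-   ON pgflow.steps USING GIN (run_if_condition);
- ```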
82
-
83
- ---
84
-
85
- ## Stage 2: Extend Flow DSL to Accept `runIf` and `runUnless`
86
-
87
- ### Current State
88
-
89
- Right now, the Flow DSL does not parse or accept `runIf` / `runUnless` fields on `.step()` definitions. It only supports basic `dependsOn`, `maxAttempts`, etc.
90
-
91
- ### Changes in This Stage
92
-
93
- **1.** Update your Flow DSL to allow:
94
-
95
- ```ts
96
- flow.step(
97
- {
98
- slug: 'myStep',
99
- runIf: { run: { userIsVIP: true } },
100
- runUnless: { run: { userIsDisabled: true } },
101
- },
102
- handlerFunction
103
- );
104
- ```
105
-
106
- **2.** During flow compilation, store these objects in the newly created DB columns (`run_if_condition`, `run_unless_condition`) via your `add_step` routine.
107
-
108
- ### End Result
109
-
110
- - The DSL can safely parse `runIf` and `runUnless` (both optional).
111
- - If both exist, we don’t treat them as mutually exclusive; we simply store both in the database. They are combined: the step runs only if it satisfies the `runIf` condition and does not satisfy the `runUnless` condition.
112
-
113
- ### Code Snippets
114
-
115
- Below is a conceptual snippet showing how you might adapt your step-registration helper (here called `addStepToDB`) in TypeScript to include condition data:
116
-
117
- ```ts
118
- interface StepOptions {
119
- slug: string;
120
- dependsOn?: string[];
121
- runIf?: Record<string, any>; // or a more structured type
122
- runUnless?: Record<string, any>;
123
- maxAttempts?: number;
124
- baseDelay?: number;
125
- timeout?: number;
126
- // ...
127
- }
128
-
129
- export async function addStepToDB(flowSlug: string, opts: StepOptions) {
130
- const { slug, dependsOn, runIf, runUnless, maxAttempts, baseDelay, timeout } =
131
- opts;
132
-
133
- // Convert the runIf/runUnless objects to JSON if needed
134
- const runIfJSON = runIf ? JSON.stringify(runIf) : null;
135
- const runUnlessJSON = runUnless ? JSON.stringify(runUnless) : null;
136
-
137
- // Insert into DB. Sample with direct SQL or a query builder:
138
- await db.query(
139
- `
140
- SELECT pgflow.add_step(
141
- $1, -- flow_slug
142
- $2, -- step_slug
143
- $3, -- deps_slugs
144
- $4, -- max_attempts
145
- $5, -- base_delay
146
- $6 -- timeout
147
- )
148
- `,
149
- [flowSlug, slug, dependsOn || [], maxAttempts, baseDelay, timeout]
150
- );
151
-
152
- // Now update the runIf and runUnless columns (or inline them in the add_step function)
153
- await db.query(
154
- `
155
- UPDATE pgflow.steps
156
- SET run_if_condition = $1::jsonb,
157
- run_unless_condition = $2::jsonb
158
- WHERE flow_slug = $3 AND step_slug = $4
159
- `,
160
- [runIfJSON, runUnlessJSON, flowSlug, slug]
161
- );
162
- }
163
- ```
164
-
165
- > **Commentary**:
166
- >
167
- > - You can either expand `pgflow.add_step` to include the condition columns or do a second `UPDATE` as shown. The “second update” method is a typical approach if you want to keep your existing `pgflow.add_step` signature minimal.
168
-
169
- ### Alternatives That We Dismissed
170
-
171
- - **Force the user to define a single condition object**: We want to keep `runIf` and `runUnless` separate to clarify their usage and keep the logic straightforward.
172
-
173
- ### Additional Bonus Ideas
174
-
175
- - **Validate the condition schema**: You could easily add a small function that checks the structure of the `runIf` / `runUnless` JSON and ensures no invalid patterns. This is a cheap addition that can prevent mistakes in flow definitions.
176
- - **Generate TypeScript types from the condition**: If you want strong typed checks, you could use a custom type or a small library (like `zod`) to define the shape of your conditions.
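-
- As a rough illustration of the zod idea, here is a minimal sketch; the condition shape used below is an assumption that mirrors the `{ run: { ... } }` objects shown earlier:
-
- ```ts
- import { z } from 'zod';
-
- // Assumed shape: conditions are keyed by 'run' or a dependency slug,
- // with partial-match objects as values (mirroring the examples above).
- const ConditionSchema = z.record(z.string(), z.record(z.string(), z.unknown()));
-
- const StepConditionOptions = z.object({
-   runIf: ConditionSchema.optional(),
-   runUnless: ConditionSchema.optional(),
- });
-
- // Throws a descriptive error if a flow definition passes a malformed condition
- export function validateConditions(opts: unknown) {
-   return StepConditionOptions.parse(opts);
- }
- ```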
177
-
178
- ---
179
-
180
- ## Stage 3: Implement the Skipping Logic in the Engine
181
-
182
- ### Current State
183
-
184
- Our engine logic (e.g. `start_ready_steps`, `complete_task`, etc.) does not evaluate conditions. Steps automatically move from `created` to `started` if their dependencies are completed.
185
-
186
- ### Changes in This Stage
187
-
188
- **1.** **Teaching the Engine to Evaluate Conditions**
189
-
190
- - Before a step transitions from `created` → `started`, we:
191
- 1. Compute the step input (merging run input + outputs of dependencies).
192
- 2. Check `run_if_condition` (if present) → must be satisfied.
193
- 3. Check `run_unless_condition` (if present) → must _not_ be satisfied.
194
- - If the final result is “does not pass,” mark the step as `skipped`.
195
-
196
- **2.** **Skipping All Dependents**
197
-
198
- - If we skip a step, we also skip all steps that depend on it, transitively. This can be done by:
199
- - Marking the step `skipped`.
200
- - Setting `remaining_tasks = 0` (so it’s not started).
201
- - Recursively marking all children that have this step as a dependency (or waiting until those children check their own conditions and find a missing dependency output).
202
- - The simplest approach is an immediate cascade: once we skip a step, we locate all steps that depend on it, set their states to `skipped`, and continue recursively.
203
-
204
- ### End Result
205
-
206
- - Steps are conditionally skipped if `runIf` / `runUnless` doesn’t match.
207
- - Once a step is marked skipped, all of its downstream steps are also skipped.
208
- - This pairs neatly with future subflows logic, so skipping an entry step to a subflow means the entire subflow is effectively skipped.
209
-
210
- ### Code Snippets
211
-
212
- Below is a conceptual pseudo-SQL snippet you could add to `start_ready_steps` or a similar function:
213
-
214
- ```sql
215
- -- Pseudo-logic in start_ready_steps or a new function check_conditions_for_step
216
- -- Right when we pick "ready_steps" that have remaining_deps=0,
217
- -- we do an additional check on run_if_condition and run_unless_condition.
218
-
219
- WITH step_def AS (
220
- SELECT s.run_if_condition, s.run_unless_condition
221
- FROM pgflow.steps s
222
- WHERE s.flow_slug = step_state.flow_slug
223
- AND s.step_slug = step_state.step_slug
224
- ),
225
- input_data AS (
226
- -- Build the step input the same way poll_for_tasks does
227
- SELECT ... merged_input ...
228
- )
229
- SELECT
230
- -- Evaluate condition
231
- CASE
232
- WHEN step_def.run_if_condition IS NOT NULL
233
- AND NOT pgflow.evaluate_json_condition(input_data.merged_input, step_def.run_if_condition)
234
- THEN true -- should skip?
235
- WHEN step_def.run_unless_condition IS NOT NULL
236
- AND pgflow.evaluate_json_condition(input_data.merged_input, step_def.run_unless_condition)
237
- THEN true -- should skip?
238
- ELSE false
239
- END as should_skip
240
- FROM step_def, input_data;
241
-
242
- -- If should_skip = true, mark the step as "skipped" (and cascade skip).
243
- ```
244
-
245
- You would then do:
246
-
247
- ```sql
248
- UPDATE pgflow.step_states
249
- SET status = 'skipped',
250
- remaining_tasks = 0
251
- WHERE run_id = <...> AND step_slug = <...>;
252
- ```
253
-
254
- …and proceed to recursively skip or not schedule the dependents.
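-
- For the cascade itself, one possible sketch is a recursive walk over the `deps` table (the column names `flow_slug`, `step_slug`, `dep_slug` are assumptions here; adapt to your actual schema):
-
- ```sql
- -- Mark a step as skipped and cascade to all transitive dependents
- WITH RECURSIVE to_skip AS (
-   SELECT '<step_slug>'::text AS step_slug
-   UNION
-   SELECT d.step_slug
-   FROM pgflow.deps d
-   JOIN to_skip ts ON d.dep_slug = ts.step_slug
-   WHERE d.flow_slug = '<flow_slug>'
- )
- UPDATE pgflow.step_states
- SET status = 'skipped',
-     remaining_tasks = 0
- WHERE run_id = <...>
-   AND step_slug IN (SELECT step_slug FROM to_skip);
- ```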
255
-
256
- > **Commentary**:
257
- >
258
- > - The function `evaluate_json_condition` is an example utility or snippet to do partial matching. The simplest approach might be a containment check `input @> condition` for `runIf`, but real use cases may require more advanced logic.
259
- > - For `runUnless`, invert that check.
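-
- For reference, a plain TypeScript sketch of the same idea (it approximates `@>` containment for plain objects and is not part of pgflow):
-
- ```ts
- type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
-
- // Rough approximation of jsonb containment (`input @> condition`)
- function contains(input: Json, condition: Json): boolean {
-   if (Array.isArray(condition)) {
-     return (
-       Array.isArray(input) &&
-       condition.every((c) => input.some((i) => contains(i, c)))
-     );
-   }
-   if (condition !== null && typeof condition === 'object') {
-     return (
-       input !== null &&
-       typeof input === 'object' &&
-       !Array.isArray(input) &&
-       Object.entries(condition).every(
-         ([key, value]) => key in input && contains(input[key], value)
-       )
-     );
-   }
-   return input === condition;
- }
-
- // Combines runIf / runUnless exactly as described in Stage 1
- function shouldSkip(input: Json, runIf?: Json, runUnless?: Json): boolean {
-   if (runIf !== undefined && !contains(input, runIf)) return true;
-   if (runUnless !== undefined && contains(input, runUnless)) return true;
-   return false;
- }
- ```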
260
-
261
- ### Alternatives That We Dismissed
262
-
263
- - **Deferring skip decisions until the worker**: We want to skip in the DB itself to avoid scheduling tasks at all. Letting the worker do skip checks is feasible but complicates logic, because tasks would appear “started” even if we intend to skip them.
264
-
265
- ### Additional Bonus Ideas
266
-
267
- - **Add a dedicated skip function**: e.g., `pgflow.skip_step(run_id, step_slug)`, which handles the cascade by walking the `deps` table. This can be re-used in other automation or for manually skipping certain parts of a flow.
268
- - **Add a column to store "skip reason"** if you want to see _why_ a step was skipped. This can be as simple as a text column stating “Condition not met.”
269
-
270
- ---
271
-
272
- ## Stage 4: Evaluating Whether to Store Task Outputs in `step_states`
273
-
274
- ### Current State
275
-
276
- Currently, **step outputs** live in `step_tasks.output` when a task completes. We do not store them in `step_states`, so if you want to see a step’s final output, you must query the tasks table (and typically look up the single “completed” task).
277
-
278
- ### Changes in This Stage
279
-
280
- - If we decide to store outputs in `step_states.output`, we would add a new column, for example:
281
- ```sql
282
- ALTER TABLE pgflow.step_states
283
- ADD COLUMN output jsonb DEFAULT NULL;
284
- ```
285
- - Upon `complete_task`, we would also update `step_states.output` with the same data.
286
-
287
- ### End Result
288
-
289
- - **Pros**:
290
- - Easier to query the final output of a step; you can see it directly on `step_states` without additional joins.
291
- - Potentially simpler logic for building subflow or advanced branching.
292
- - **Cons**:
293
- - Data duplication (the same JSON also lives in `step_tasks.output`).
294
- - Slightly more overhead on `complete_task` updates (we store output in two places).
295
-
296
- ### Code Snippet Example
297
-
298
- Below is an example of how you might add it to the `complete_task` function:
299
-
300
- ```sql
301
- -- In the relevant part of complete_task:
302
- WITH ...
303
- step_state AS (
304
- UPDATE pgflow.step_states
305
- SET
306
- status = CASE
307
- WHEN remaining_tasks = 1 THEN 'completed'
308
- ELSE status
309
- END,
310
- remaining_tasks = remaining_tasks - 1,
311
- output = complete_task.output -- <--- new line to store output
312
- WHERE pgflow.step_states.run_id = complete_task.run_id
313
- AND pgflow.step_states.step_slug = complete_task.step_slug
314
- RETURNING pgflow.step_states.*
315
- )
316
- ...
317
- ```
318
-
319
- ### Alternatives That We Dismissed
320
-
321
- - **Not storing outputs in `step_states`**: We already store them in `step_tasks`, and that might be enough if your usage is straightforward.
322
-
323
- ### Additional Bonus Ideas
324
-
325
- - **Store partial outputs**: If you have steps that produce large or multiple results, you could store a summary or hash in `step_states.output` while the full result remains in `step_tasks.output`.
326
- - **Prune or archive outputs**: If data grows large, you might add a policy to prune older outputs from `step_tasks` but keep a final summary in `step_states`.
327
-
328
- ---
329
-
330
- ## Actionable Steps
331
-
332
- Below is the TODO-like checklist you can use for your Monday planning. Each item references the most relevant section above.
333
-
334
- - TODO Update schema to include condition columns
335
-
336
- - ```sql
337
- -- example sql
338
- ALTER TABLE pgflow.steps
339
- ADD COLUMN run_if_condition JSONB DEFAULT NULL,
340
- ADD COLUMN run_unless_condition JSONB DEFAULT NULL;
341
- ```
342
- - see [link to relevant section](#stage-1-add-condition-columns-and-skipped-status)
343
-
344
- - TODO Allow "skipped" in step_states.status
345
-
346
- - ```sql
347
- ALTER TABLE pgflow.step_states
348
- DROP CONSTRAINT step_states_status_check,
349
- ADD CONSTRAINT step_states_status_check CHECK (
350
- status IN ('created', 'started', 'completed', 'failed', 'skipped')
351
- );
352
- ```
353
- - see [link to relevant section](#stage-1-add-condition-columns-and-skipped-status)
354
-
355
- - TODO Extend DSL to parse runIf / runUnless and store them in DB
356
-
357
- - see [link to relevant section](#stage-2-extend-flow-dsl-to-accept-runif-and-rununless)
358
-
359
- - TODO Implement skip logic in start_ready_steps (or a new function)
360
-
361
- - see [link to relevant section](#stage-3-implement-the-skipping-logic-in-the-engine)
362
-
363
- - TODO Implement cascade skipping approach to mark all dependents as skipped
364
-
365
- - see [link to relevant section](#stage-3-implement-the-skipping-logic-in-the-engine)
366
-
367
- - TODO (Optional) Add output column to step_states to simplify final lookups
368
- - see [link to relevant section](#stage-4-evaluating-whether-to-store-task-outputs-in-step_states)
369
-
370
- ---
371
-
372
- **With this plan in place**, you will be able to conditionally skip steps based on `runIf` / `runUnless` in a robust, extensible way, and optionally simplify step output retrieval by adding an `output` column to `step_states`. This ensures your conditional logic is easy to maintain, debug, and extend—particularly if you decide to leverage subflows or more advanced branching strategies in the future.
@@ -1,225 +0,0 @@
1
- # Brainstorm: Flow DSL → SQL, JSON Schema Generation, and Deployment Strategies
2
-
3
- This document explores how we can convert a TypeScript-based **Flow DSL** (a typed object instance describing your workflow) into a **pgflow** definition housed in PostgreSQL. We also discuss best practices for **immutable** flows, versioning via `flow_slug`, and how to handle **JSON Schema** generation for step inputs/outputs. Our goal is an **exceptional developer experience**—enabling both rapid iteration in development and safe, auditable deployments in production.
4
-
5
- Below is an outline of potential approaches, trade-offs, and new ideas inspired by similar frameworks.
6
-
7
- ---
8
-
9
- ## Why Convert a Flow DSL to SQL?
10
-
11
- 1. **Single Source of Truth**: The TypeScript DSL is the ideal developer-friendly environment (auto-complete, type inference, etc.) to define the workflow shape. However, pgflow itself requires the flow definition to be materialized as rows in the database (`flows`, `steps`, `deps`).
12
- 2. **Immutability Enforcement**: Keeping flows immutable in production simplifies the system: if we detect a new or changed flow shape, we use a new `flow_slug` rather than mutating the old definition.
13
- 3. **Visibility & Auditing**: Generating SQL migrations allows teams to see the exact changes to flows over time, fitting existing DevOps pipelines (e.g., ephemeral in dev vs. migrations in prod).
14
-
15
- ---
16
-
17
- ## High-Level Workflows
18
-
19
- ### 1. CLI Tool: “pgflow”
20
-
21
- A dedicated CLI tool could handle:
22
-
23
- 1. **Compile**: Take a `.ts` file with the Flow DSL, generate:
24
- - SQL statements for `create_flow` and `add_step`.
25
- - Optionally, **JSON Schemas** for each step’s inputs/outputs (plus the flow’s overall input and output).
26
- 2. **Deploy** (Development Mode):
27
- - Read the DSL in memory, produce SQL, and directly execute against the dev database.
28
- - Potentially auto-drop existing flows with the same slug (losing old runs, but okay for dev).
29
- 3. **Generate Migrations** (Production Mode):
30
- - Write the generated SQL (and JSON Schemas if needed) to a `.sql` file in `migrations/` for a formal, tracked deployment.
31
- - If a flow with the same `flow_slug` already exists in the DB but the shape differs, the migration fails (immutable flow violation).
32
- 4. **Check**: Compare the DSL shape to what’s in the DB:
33
- - If it differs, error out (unless you force a new slug).
34
- - If it’s the same, do nothing.
35
-
36
- **Pros**
37
- - Familiar, explicit approach—TypeScript code → generated SQL → migrations.
38
- - Integrates well with typical CI/CD tools (Liquibase, Flyway, etc.).
39
- - Clear separation of concerns around dev vs. prod.
40
-
41
- **Cons**
42
- - Requires running an extra command or hooking it into your build pipeline.
43
-
44
- ### 2. Edge Worker Auto-Registration
45
-
46
- Alternatively, the Edge Worker can auto-check if a flow is present or matches:
47
-
48
- 1. On startup or first usage of a flow, the Worker inspects the flow shape.
49
- 2. If that `flow_slug` is missing in the DB:
50
- - In **dev**, create it on the fly.
51
- - In **prod**, optionally refuse or raise an error if no flow definition is found (reducing “magic” behind the scenes).
52
- 3. If the same `flow_slug` is found but shapes differ, throw an error (honoring immutability).
53
-
54
- **Pros**
55
- - Minimal friction—no manual step needed.
56
- - Great for quick local experiments.
57
-
58
- **Cons**
59
- - Less transparent for production pipelines.
60
- - Could lead to unintentional overwrites if not carefully guarded.
61
-
62
- ### 3. Hybrid of Both
63
-
64
- Commonly, you can combine the CLI approach for local dev (auto-deploy flows on every change) with a “strict” migration approach for production. This approach ensures:
65
-
66
- - **Dev:** Speedy iteration, auto-drop for the same slug.
67
- - **Prod:** Each flow creation is an explicit migration. If shape changes, you must rename the slug or handle versioning carefully.
68
-
69
- ---
70
-
71
- ## Handling Immutability and Versioning
72
-
73
- pgflow enforces **immutable** flow definitions in production-like environments:
74
-
75
- 1. **Immutable**: Once `create_flow(flow_slug, ...)` is called, that shape in the DB cannot be replaced.
76
- 2. **Versioning**: If you need to upgrade a flow shape in a backward-incompatible way, create a new slug (e.g. `analyze_website_v2`).
77
- 3. **No “latest” Aliases**: Currently avoided to simplify behavior. Users can implement their own version-discovery logic if needed.
78
-
79
- **Why?**
80
- - Eliminates “partial-upgrades” or in-flight runs being in a broken state.
81
- - Encourages explicit version bumps, making changes discoverable and trackable.
82
- - Tends to be simpler operationally, particularly for large teams.
83
-
84
- ---
85
-
86
- ## JSON Schema Generation for Validation
87
-
88
- An emerging idea is to **auto-generate JSON Schemas** for each step’s input/output based on TypeScript’s type information. This can provide:
89
-
90
- 1. **DB-Level Validation**: Optionally store these schemas in the `flows` or `steps` table (e.g. in a `json_schemas` column) so that the database can:
91
- - Validate step output when you call `complete_task`.
92
- - Validate run input during `start_flow`.
93
- 2. **Edge Worker Validation**: The Edge Worker can use the same schemas at runtime to ensure the data processed by each step is well-formed.
94
- 3. **Documentation**: The schema acts as living documentation for what each step expects/produces.
95
-
96
- **Implementation Sketch**
97
- - During CLI compilation, parse TypeScript AST or rely on a TypeScript-to-JSON-Schema library (e.g. [typescript-json-schema](https://github.com/YousefED/typescript-json-schema)).
98
- - For each Flow:
99
- 1. **Flow Input Schema**: Generated from the `Flow<Input>()` input type.
100
- 2. **Step Output Schemas**: Generated from the return type of each step’s handler.
101
- 3. Store each schema (or a combined schema object) in an internal structure to be inserted in the DB or saved as `.json` files alongside `.sql` migrations.
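-
- A rough sketch of the generation step with `typescript-json-schema` (the file path, type name, and compiler options below are illustrative):
-
- ```ts
- import { resolve } from 'node:path';
- import * as TJS from 'typescript-json-schema';
-
- // Build a TypeScript program from the flow definition file
- const program = TJS.getProgramFromFiles(
-   [resolve('flows/AnalyzeWebsiteV2.ts')],
-   { strictNullChecks: true }
- );
-
- // Generate a JSON Schema for the flow's input type
- const flowInputSchema = TJS.generateSchema(program, 'FlowInput', { required: true });
-
- console.log(JSON.stringify(flowInputSchema, null, 2));
- ```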
102
-
103
- **Challenges**
104
- - Not all TypeScript features map perfectly to JSON Schema.
105
- - Could lead to extra complexity if not curated carefully.
106
- - Must decide if we want the DB to do the validation or if the Worker does it pre-insertion.
107
-
108
- **Value**
109
- - Certain data-sensitive or compliance-driven teams might love having strict JSON validations.
110
- - Great for debugging and ensuring you never pass unexpected data to a step.
111
- - Aligns with modern “schema-first” or “contract-based” development patterns.
112
-
113
- ---
114
-
115
- ## Detailed Example: CLI Flow with JSON Schemas
116
-
117
- 1. **Define the Flow in TypeScript**
118
- ```ts
119
- import { Flow } from "pgflow-dsl"; // hypothetical npm package
120
-
121
- type FlowInput = { url: string };
122
-
123
- // Returns { content: string; status: number }
124
- async function scrapeWebsite(args: { run: FlowInput }) { /* ... */ }
125
-
126
- // Returns { sentimentScore: number }
127
- async function analyzeSentiment(args: { run: FlowInput; website: { content: string } }) { /* ... */ }
128
-
129
- export const AnalyzeWebsiteV2 = new Flow<FlowInput>({
130
- slug: "analyze_website_v2",
131
- })
132
- .step({ slug: "website" }, scrapeWebsite)
133
- .step({ slug: "sentiment", dependsOn: ["website"] }, analyzeSentiment);
134
- ```
135
-
136
- 2. **Compile** (in dev or CI)
137
- ```bash
138
- $ pgflow compile --file=flows/AnalyzeWebsiteV2.ts --out=migrations/2023-10-01_analyze_website_v2
139
- ```
140
- This command:
141
-    - Translates the flow definition into SQL that calls `pgflow.create_flow('analyze_website_v2')`, then `pgflow.add_step(...)` (see the sketch after this list).
142
- - Generates JSON Schemas for:
143
- - `FlowInput = { url: string }`
144
- - Output of `scrapeWebsite` = `{ content: string; status: number }`
145
- - Output of `analyzeSentiment` = `{ sentimentScore: number }`
146
- - Writes them to:
147
- - `migrations/2023-10-01_analyze_website_v2.sql`
148
-      - `migrations/2023-10-01_analyze_website_v2.schemas.json`
149
-
150
- 3. **Deploy in Development**
151
- - Directly run the `.sql` file (auto-drop if you want ephemeral dev).
152
- - Insert or update the flow definitions and store the JSON Schemas in a table or a sidecar.
153
-
154
- 4. **Deploy in Production**
155
- - Use your standard “apply SQL migrations” process.
156
-    - If the slug `analyze_website_v2` already exists with a different shape, the migration fails, ensuring immutability.
157
- - JSON Schemas can be inserted in a robust, single transaction for end-to-end validation coverage.
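-
- For the example above, the generated migration could look roughly like this (the exact `add_step` parameter list is abbreviated; see your actual `pgflow.add_step` signature):
-
- ```sql
- -- migrations/2023-10-01_analyze_website_v2.sql (illustrative output)
- SELECT pgflow.create_flow('analyze_website_v2');
- SELECT pgflow.add_step('analyze_website_v2', 'website');
- SELECT pgflow.add_step('analyze_website_v2', 'sentiment', ARRAY['website']);
- ```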
158
-
159
- ---
160
-
161
- ## New Ideas for an Exceptional Developer Experience
162
-
163
- Below are some additional ideas inspired by developer-centric tools like Prisma, RedwoodJS, and Hasura:
164
-
165
- 1. **Interactive CLI** (“Flow Playground”):
166
- - A TUI (terminal UI) or small local web server that watches your DSL files for changes, regenerates flow definitions in real-time, and displays a DAG diagram (using your JSON Schemas for clarity).
167
- - Allows you to visually verify your steps, dependencies, and data shapes.
168
-
169
- 2. **Schema-based “Upgrades”**:
170
- - For advanced teams, you might track how the JSON Schemas for each step differ between “v1” and “v2.”
171
- - Let the CLI generate a helpful diff or summary of what changed in your flow shapes.
172
-
173
- 3. **Checksum-based Verification**:
174
- - The DSL → SQL + Schemas produce a unique hash of sorts.
175
-    - If the deployment script sees a mismatch, it rejects the update unless a new version slug is introduced (see the sketch after this list).
176
- - Minimizes accidental “silent updates.”
177
-
178
- 4. **Embedded JSON Validation**:
179
- - A variant that uses PostgreSQL’s JSON schema validation extension (if installed) or triggers.
180
- - Ensures at the DB level that `complete_task(step_output)` matches the declared JSON Schema.
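-
- A minimal sketch of the checksum idea from item 3 above, assuming the compiled SQL and the serialized schemas are available as strings:
-
- ```ts
- import { createHash } from 'node:crypto';
-
- // Deterministic fingerprint of a compiled flow: SQL + JSON Schemas
- export function flowChecksum(compiledSql: string, schemasJson: string): string {
-   return createHash('sha256')
-     .update(compiledSql)
-     .update('\n')
-     .update(schemasJson)
-     .digest('hex');
- }
- ```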
181
-
182
- ---
183
-
184
- ## Quick Start: Checklist
185
-
186
- 1. **Install the CLI** (hypothetical):
187
- ```bash
188
- npm install -g pgflow-cli
189
- ```
190
- 2. **Author Your Flow** in a `.ts` file.
191
- 3. **Compile** to generate SQL + JSON Schemas:
192
- ```bash
193
- pgflow compile --file flows/myFlow.ts --out migrations/xxxx_my_flow
194
- ```
195
- 4. **Deploy**:
196
- - **Development**:
197
- ```bash
198
- pgflow deploy --dev migrations/xxxx_my_flow.sql
199
- ```
200
- - **Production**:
201
- Use your standard migration pipeline to apply the `.sql` (and `.json` if needed).
202
-
203
- 5. **Run a Workflow**:
204
- ```sql
205
- SELECT * FROM pgflow.start_flow('my_flow_slug', '{"foo":"bar"}'::jsonb);
206
- ```
207
- 6. **Watch** it execute in the [Edge Worker](../edge-worker/README.md) or poll tasks manually.
208
-
209
- ---
210
-
211
- ## Conclusion
212
-
213
- By introducing a **Flow DSL → SQL** conversion layer, **JSON Schema** generation, and a straightforward **CLI** or **auto-registration** approach, **pgflow** can stay minimal while still giving you a world-class developer experience. Flows remain **immutable** in production, with new slugs representing new shapes, ensuring no half-upgraded states mid-run.
214
-
215
- ### Key Takeaways
216
-
217
- - **MVP Focus**: Keep it simple but design for an awesome DX.
218
- - **Immutable, Versioned Flows**: A crucial architectural choice to avoid partial upgrades.
219
- - **Optional JSON Schemas**: A major value-add for teams needing data validation, auditable definitions, and strong typed documentation.
220
- - **Dev vs. Prod**:
221
- - Development: auto-recreate flows as you code.
222
- - Production: carefully migrated `.sql` and `.json` schemas.
223
- - **Potential for Growth**: Build a truly “famous” developer-friendly tool by delivering strong defaults, low friction, and the option to scale up complexity where needed.
224
-
225
- We encourage you to try out the **pgflow CLI** approach, experiment with auto-generation of JSON Schemas, and integrate them into your development flow. The result is a robust, type-safe, and auditable workflow system that’s approachable for small teams yet powerful enough for enterprise-scale needs.