@pgflow/core 0.0.5 → 0.0.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (110)
  1. package/{CHANGELOG.md → dist/CHANGELOG.md} +6 -0
  2. package/package.json +8 -5
  3. package/__tests__/mocks/index.ts +0 -1
  4. package/__tests__/mocks/postgres.ts +0 -37
  5. package/__tests__/types/PgflowSqlClient.test-d.ts +0 -59
  6. package/docs/options_for_flow_and_steps.md +0 -75
  7. package/docs/pgflow-blob-reference-system.md +0 -179
  8. package/eslint.config.cjs +0 -22
  9. package/example-flow.mermaid +0 -5
  10. package/example-flow.svg +0 -1
  11. package/flow-lifecycle.mermaid +0 -83
  12. package/flow-lifecycle.svg +0 -1
  13. package/out-tsc/vitest/__tests__/mocks/index.d.ts +0 -2
  14. package/out-tsc/vitest/__tests__/mocks/index.d.ts.map +0 -1
  15. package/out-tsc/vitest/__tests__/mocks/postgres.d.ts +0 -15
  16. package/out-tsc/vitest/__tests__/mocks/postgres.d.ts.map +0 -1
  17. package/out-tsc/vitest/__tests__/types/PgflowSqlClient.test-d.d.ts +0 -2
  18. package/out-tsc/vitest/__tests__/types/PgflowSqlClient.test-d.d.ts.map +0 -1
  19. package/out-tsc/vitest/tsconfig.spec.tsbuildinfo +0 -1
  20. package/out-tsc/vitest/vite.config.d.ts +0 -3
  21. package/out-tsc/vitest/vite.config.d.ts.map +0 -1
  22. package/pkgs/core/dist/index.js +0 -54
  23. package/pkgs/core/dist/pkgs/core/LICENSE.md +0 -660
  24. package/pkgs/core/dist/pkgs/core/README.md +0 -373
  25. package/pkgs/dsl/dist/index.js +0 -123
  26. package/pkgs/dsl/dist/pkgs/dsl/README.md +0 -11
  27. package/pkgs/edge-worker/dist/index.js +0 -953
  28. package/pkgs/edge-worker/dist/index.js.map +0 -7
  29. package/pkgs/edge-worker/dist/pkgs/edge-worker/LICENSE.md +0 -660
  30. package/pkgs/edge-worker/dist/pkgs/edge-worker/README.md +0 -46
  31. package/pkgs/example-flows/dist/index.js +0 -152
  32. package/pkgs/example-flows/dist/pkgs/example-flows/README.md +0 -11
  33. package/project.json +0 -125
  34. package/prompts/architect.md +0 -87
  35. package/prompts/condition.md +0 -33
  36. package/prompts/declarative_sql.md +0 -15
  37. package/prompts/deps_in_payloads.md +0 -20
  38. package/prompts/dsl-multi-arg.ts +0 -48
  39. package/prompts/dsl-options.md +0 -39
  40. package/prompts/dsl-single-arg.ts +0 -51
  41. package/prompts/dsl-two-arg.ts +0 -61
  42. package/prompts/dsl.md +0 -119
  43. package/prompts/fanout_steps.md +0 -1
  44. package/prompts/json_schemas.md +0 -36
  45. package/prompts/one_shot.md +0 -286
  46. package/prompts/pgtap.md +0 -229
  47. package/prompts/sdk.md +0 -59
  48. package/prompts/step_types.md +0 -62
  49. package/prompts/versioning.md +0 -16
  50. package/queries/fail_permanently.sql +0 -17
  51. package/queries/fail_task.sql +0 -21
  52. package/queries/sequential.sql +0 -47
  53. package/queries/two_roots_left_right.sql +0 -59
  54. package/schema.svg +0 -1
  55. package/scripts/colorize-pgtap-output.awk +0 -72
  56. package/scripts/run-test-with-colors +0 -5
  57. package/scripts/watch-test +0 -7
  58. package/src/PgflowSqlClient.ts +0 -85
  59. package/src/database-types.ts +0 -759
  60. package/src/index.ts +0 -3
  61. package/src/types.ts +0 -103
  62. package/supabase/config.toml +0 -32
  63. package/supabase/seed.sql +0 -202
  64. package/supabase/tests/add_step/basic_step_addition.test.sql +0 -29
  65. package/supabase/tests/add_step/circular_dependency.test.sql +0 -21
  66. package/supabase/tests/add_step/flow_isolation.test.sql +0 -26
  67. package/supabase/tests/add_step/idempotent_step_addition.test.sql +0 -20
  68. package/supabase/tests/add_step/invalid_step_slug.test.sql +0 -16
  69. package/supabase/tests/add_step/nonexistent_dependency.test.sql +0 -16
  70. package/supabase/tests/add_step/nonexistent_flow.test.sql +0 -13
  71. package/supabase/tests/add_step/options.test.sql +0 -66
  72. package/supabase/tests/add_step/step_with_dependency.test.sql +0 -36
  73. package/supabase/tests/add_step/step_with_multiple_dependencies.test.sql +0 -46
  74. package/supabase/tests/complete_task/archives_message.test.sql +0 -67
  75. package/supabase/tests/complete_task/completes_run_if_no_more_remaining_steps.test.sql +0 -62
  76. package/supabase/tests/complete_task/completes_task_and_updates_dependents.test.sql +0 -64
  77. package/supabase/tests/complete_task/decrements_remaining_steps_if_completing_step.test.sql +0 -62
  78. package/supabase/tests/complete_task/saves_output_when_completing_run.test.sql +0 -57
  79. package/supabase/tests/create_flow/flow_creation.test.sql +0 -27
  80. package/supabase/tests/create_flow/idempotency_and_duplicates.test.sql +0 -26
  81. package/supabase/tests/create_flow/invalid_slug.test.sql +0 -13
  82. package/supabase/tests/create_flow/options.test.sql +0 -57
  83. package/supabase/tests/fail_task/exponential_backoff.test.sql +0 -70
  84. package/supabase/tests/fail_task/mark_as_failed_if_no_retries_available.test.sql +0 -49
  85. package/supabase/tests/fail_task/respects_flow_retry_settings.test.sql +0 -48
  86. package/supabase/tests/fail_task/respects_step_retry_settings.test.sql +0 -48
  87. package/supabase/tests/fail_task/retry_task_if_retries_available.test.sql +0 -39
  88. package/supabase/tests/is_valid_slug.test.sql +0 -72
  89. package/supabase/tests/poll_for_tasks/builds_proper_input_from_deps_outputs.test.sql +0 -35
  90. package/supabase/tests/poll_for_tasks/hides_messages.test.sql +0 -35
  91. package/supabase/tests/poll_for_tasks/increments_attempts_count.test.sql +0 -35
  92. package/supabase/tests/poll_for_tasks/multiple_task_processing.test.sql +0 -24
  93. package/supabase/tests/poll_for_tasks/polls_only_queued_tasks.test.sql +0 -35
  94. package/supabase/tests/poll_for_tasks/reads_messages.test.sql +0 -38
  95. package/supabase/tests/poll_for_tasks/returns_no_tasks_if_no_step_task_for_message.test.sql +0 -34
  96. package/supabase/tests/poll_for_tasks/returns_no_tasks_if_queue_is_empty.test.sql +0 -19
  97. package/supabase/tests/poll_for_tasks/returns_no_tasks_when_qty_set_to_0.test.sql +0 -22
  98. package/supabase/tests/poll_for_tasks/sets_vt_delay_based_on_opt_timeout.test.sql +0 -41
  99. package/supabase/tests/poll_for_tasks/tasks_reapppear_if_not_processed_in_time.test.sql +0 -59
  100. package/supabase/tests/start_flow/creates_run.test.sql +0 -24
  101. package/supabase/tests/start_flow/creates_step_states_for_all_steps.test.sql +0 -25
  102. package/supabase/tests/start_flow/creates_step_tasks_only_for_root_steps.test.sql +0 -54
  103. package/supabase/tests/start_flow/returns_run.test.sql +0 -24
  104. package/supabase/tests/start_flow/sends_messages_on_the_queue.test.sql +0 -50
  105. package/supabase/tests/start_flow/starts_only_root_steps.test.sql +0 -21
  106. package/supabase/tests/step_dsl_is_idempotent.test.sql +0 -34
  107. package/tsconfig.json +0 -16
  108. package/tsconfig.lib.json +0 -26
  109. package/tsconfig.spec.json +0 -35
  110. package/vite.config.ts +0 -57
package/prompts/dsl-two-arg.ts DELETED
@@ -1,61 +0,0 @@
- const ScrapeWebsiteFlow = new Flow<Input>()
-   .step(
-     {
-       slug: 'verify_status',
-     },
-     async (payload) => {
-       // Placeholder function
-       return { status: 'success' };
-     }
-   )
-   .step(
-     {
-       slug: 'when_success',
-       dependsOn: ['verify_status'],
-       runIf: { verify_status: { status: 'success' } },
-     },
-     async (payload) => {
-       // Placeholder function
-       return await scrapeSubpages(
-         payload.run.url,
-         payload.table_of_contents.urls_of_subpages
-       );
-     }
-   )
-   .step(
-     {
-       slug: 'when_server_error',
-       dependsOn: ['verify_status'],
-       runUnless: { verify_status: { status: 'success' } },
-     },
-     async (payload) => {
-       // Placeholder function
-       return await generateSummaries(payload.subpages.contentsOfSubpages);
-     }
-   )
-   .step(
-     {
-       slug: 'sentiments',
-       dependsOn: ['subpages'],
-       maxAttempts: 5,
-       baseDelay: 10,
-     },
-     async (payload) => {
-       // Placeholder function
-       return await analyzeSentiments(payload.subpages.contentsOfSubpages);
-     }
-   )
-   .step(
-     {
-       slug: 'save_to_db',
-       dependsOn: ['subpages', 'summaries', 'sentiments'],
-     },
-     async (payload) => {
-       // Placeholder function
-       return await saveToDb(
-         payload.subpages,
-         payload.summaries,
-         payload.sentiments
-       );
-     }
-   );
package/prompts/dsl.md DELETED
@@ -1,119 +0,0 @@
- # Flow DSL
-
- The Flow DSL is used to define the shape of the flow and tie functions to particular steps.
-
- ## Full flow example
-
- ```ts
- const ScrapeWebsiteFlow = new Flow<Input>()
-   .step('table_of_contents', async (payload) => {
-     // Placeholder function
-     return await fetchTableOfContents(payload.run.url);
-   })
-   .step('subpages', ['table_of_contents'], async (payload) => {
-     // Placeholder function
-     return await scrapeSubpages(payload.run.url, payload.table_of_contents.urls_of_subpages);
-   })
-   .step('summaries', ['subpages'], async (payload) => {
-     // Placeholder function
-     return await generateSummaries(payload.subpages.contentsOfSubpages);
-   })
-   .step('sentiments', ['subpages'], async (payload) => {
-     // Placeholder function
-     return await analyzeSentiments(payload.subpages.contentsOfSubpages);
-   })
-   .step('save_to_db', ['subpages', 'summaries', 'sentiments'], async (payload) => {
-     // Placeholder function
-     return await saveToDb(payload.subpages, payload.summaries, payload.sentiments);
-   });
- ```
-
- ## Explanation
-
- This is a fluent-API style DSL, but it is very simple:
-
- 1. Users create a flow by initializing a `Flow` object with a mandatory
-    type annotation for the Flow `input` - this is the type of the payload
-    users start the flow with, and it must be serializable to JSON:
-
-    ```ts
-    type Input = {
-      url: string; // url of the website to scrape
-    };
-
-    const ScrapeWebsiteFlow = new Flow<Input>()
-    ```
-
- 2. Then they define steps by calling the `.step(stepSlug: string, depsSlugs: string[], handler: Function)` method.
-    The `depsSlugs` array can be omitted if the step has no dependencies.
-    Such steps are called "root steps"; they run first and are passed only the flow input payload:
-
-    ```ts
-    const ScrapeWebsiteFlow = new Flow<Input>()
-      .step('table_of_contents', async (payload) => {
-        const { run } = payload;
-        // do something
-        // make sure to return some value so next steps can use it
-        return {
-          urls_of_subpages,
-          title
-        }
-      })
-    ```
-
-    The `payload` object always has a special key `run`, which is the value passed as the flow input -
-    every step can access and use it.
-
-    What the step handler returns is very important!
-    We name it `output` and it will be persisted in the database
-    and used as `input` for the dependent steps.
-
-    It must be serializable to JSON.
-
- 3. Then they define dependent steps by calling the same `.step(stepSlug: string, depsSlugs: string[], handler: Function)` method,
-    now providing an array of dependency slugs: `['table_of_contents']`.
-
-    ```ts
-    .step('subpages', ['table_of_contents'], async (payload) => {
-      const { run, table_of_contents } = payload;
-      // do something
-      // make sure to return some value so next steps can use it
-      return {
-        contentsOfSubpages
-      }
-    })
-    ```
-
-    Notice how the `payload` object got a new key, `table_of_contents` - each dependency's
-    result (the persisted return value from its handler) is passed to `payload` under the dependency's slug key.
-
-    ```ts
-    {
-      run: { url: 'https://example.com' },
-      table_of_contents: {
-        urls_of_subpages: ['https://example.com/subpage1', 'https://example.com/subpage2']
-      }
-    }
-    ```
-
- 4. There can be multiple steps running in parallel:
-
-    ```ts
-    .step('summaries', ['subpages'], async (payload) => await doSomeStuff())
-    .step('sentiments', ['subpages'], async (payload) => await doSomeStuff())
-    ```
-
- 5. Steps can also depend on more than one other step:
-
-    ```ts
-    .step('save_to_db', ['subpages', 'summaries', 'sentiments'], async (payload) => await saveToDb())
-    ```
-
- 6. When the run finishes, the `output`s of steps that have no dependents will be combined
-    together and saved as the run's `output`. This object will be built in a similar
-    way to the step `input` objects, but will lack the `run` key.
-
- 7. Type safety - all the step payload types are inferred from the combination
-    of the Flow input, the handlers' inferred return types and the shape of the graph.
-
-    So users will always know what type their step input is.
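For orientation, the graph shape built by the deleted dsl.md example is ultimately registered in SQL. Below is a minimal sketch of what that registration could look like, following the `pgflow.add_step(flow_slug, step_slug, dep_step_slugs[])` pattern described in the deleted one_shot.md prompt further down; the flow slug and the exact signatures shipped in 0.0.7 are assumptions and may differ.

```sql
-- Hypothetical compile target for the ScrapeWebsiteFlow example above.
-- Signatures follow the deleted one_shot.md prompt, not necessarily the 0.0.7 schema.
SELECT pgflow.add_step('scrape_website', 'table_of_contents', ARRAY[]::text[]);
SELECT pgflow.add_step('scrape_website', 'subpages', ARRAY['table_of_contents']);
SELECT pgflow.add_step('scrape_website', 'summaries', ARRAY['subpages']);
SELECT pgflow.add_step('scrape_website', 'sentiments', ARRAY['subpages']);
SELECT pgflow.add_step('scrape_website', 'save_to_db', ARRAY['subpages', 'summaries', 'sentiments']);
```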
package/prompts/fanout_steps.md DELETED
@@ -1 +0,0 @@
- .file prompts/architect.md -- we are in monorepo for pgflow - postgres native workflow engine. we are not in the root, we are in pkgs/core - an sql part. sql code lives in supabase/migrations/ and tests live in supabase/tests/. you can check some info about step types and declarative sql in prompts/ folder. your job is to implement the next step type - 'fanout_tasks' type, which just enqueues multiple tasks for a single step - one task per input array item. this step type must have a json path parameter that will tell which part of input is an array of items. you must change the code so steps with this type are handled differently in all the functions. use task_index to indicate which array item the given task is for. when completing a task, do not proceed with completing steps etc. unless all the tasks for the given step state are completed, then use task_index to order their outputs and use all those as a step output array. try to understand what needs to be done first. modify existing migrations, do not create new ones - this is unreleased source code and we can change migrations.
package/prompts/json_schemas.md DELETED
@@ -1,36 +0,0 @@
- # JSON Schemas
-
- JSON schemas can be inferred from the steps' `input` types,
- so it is relatively easy to build a JSON schema for each step input.
-
- The same goes for the JSON Schema for the flow input.
-
- ## Schema storage
-
- Schemas should be stored in the `pgflow.flows` and `pgflow.steps` tables.
-
- ## Schemas in versioning
-
- To make sure that slight changes in the input/output types of steps
- trigger a new version of the flow, we need to use the inferred schemas
- when generating a version hash of the flow.
-
- ## Schemas as validation
-
- We can use schemas to do data validation for step handlers:
-
- 1. Task executors can validate the runtime input payloads for handlers
-    and their output results against the schema.
- 2. The core SQL engine can use `pg_jsonschema` to validate the input values to flows
-    and maybe the input values to steps, and fail steps if they don't match.
-
- ## Problems
-
- Doing any JSON Schema validation in the database is probably not a good idea because
- of the performance impact it would have.
-
- Using runtime validation in Task Executors is probably good enough,
- with the exception of validating the Flow input - you start flows less often than
- steps, and it seems like a good idea to validate that input database-side.
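The database-side validation idea from the deleted json_schemas.md can be illustrated with `pg_jsonschema`, which provides functions such as `jsonb_matches_schema(schema json, instance jsonb)`. A minimal sketch, using a hand-written schema rather than one stored on `pgflow.flows`; nothing in this package is known to call this, it only illustrates the idea above:

```sql
-- Hypothetical validation of a flow input against a JSON schema via pg_jsonschema.
SELECT jsonb_matches_schema(
  '{"type": "object", "properties": {"url": {"type": "string"}}, "required": ["url"]}'::json,
  '{"url": "https://example.com"}'::jsonb
);
```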
package/prompts/one_shot.md DELETED
@@ -1,286 +0,0 @@
- Your job is to implement the required SQL schemas and functions for an MVP of my open source Postgres-native workflow orchestration engine called pgflow.
-
- The main idea of the project is to keep the shape of the DAG (nodes and edges) and its runtime state in the database
- and expose SQL functions that allow that state to be advanced.
-
- Real work is done by the task queue workers, and the pgflow functions only orchestrate
- the queue messages.
-
- Workers are supposed to call user functions with the input from the queue message,
- and should acknowledge the completion of the task or its failure (error thrown) by
- calling the appropriate pgflow SQL functions.
-
- This way the orchestration is decoupled from the execution.
-
- I have a concrete implementation plan for you to follow and will unfold it
- step by step below.
-
- ## Assumptions/best practices
-
- ### We are building a Minimum Viable Product
-
- Remember that we are building an MVP and the main focus should be on shipping something as soon as possible,
- by cutting scope and simplifying the architecture and code.
-
- But the outlined features are definitely something that we will be doing in the future.
- I am most certain about the foreach-array steps - this is a MUST-have.
- So your focus should be on implementing the MVP without closing the door to future improvements.
-
- ### Slugs
-
- We do not use serial IDs or UUIDs for static things; we use "slugs" instead.
- A slug is just a string that conforms to the following rules:
-
- ```sql
- slug is not null
- and slug <> ''
- and length(slug) <= 128
- and slug ~ '^[a-zA-Z_][a-zA-Z0-9_]*$';
- ```
-
- We use a UUID to identify a particular run of the flow.
- But the states of steps for that particular run are not identified by separate UUIDs;
- they are identified by the pair of run_id and step_slug. This pattern makes it easy to refer
- to steps and flows by their slugs. **Leverage this pattern everywhere you can!**
-
- ### References/fkeys
-
- Use foreign keys everywhere to ensure consistency.
- Use composite foreign keys and composite primary keys composed of flow/step slugs and run_ids if needed.
-
- ### Declarative vs procedural
-
- **YOU MUST ALWAYS PRIORITIZE DECLARATIVE STYLE** and prioritize batching operations.
-
- Avoid plpgsql as much as you can.
- It is important to have your DB procedures run in batched ways and use declarative rather than procedural constructs where possible:
-
- - do not ever use `language plpgsql` in functions, always use `language sql`
- - don't do loops, write SQL statements that address multiple rows at once.
- - don't write trigger functions that fire for a single row, use `FOR EACH STATEMENT` instead.
- - don't call functions for each row in a result set, a condition, a join, or whatever; instead use functions that return `SETOF` and join against these.
-
- If you're constructing dynamic SQL, you should only ever use `%I` and `%L` when using `FORMAT` or similar; you should never see `%s` (with the very rare exception of where you're merging in another SQL fragment that you've previously formatted using %I and %L).
-
- Remember that functions have significant overhead in Postgres - instead of factoring into lots of tiny functions, think about how to make your code more expressive so there's no need.
-
- ## Schemas
-
- ### pgflow.flows
-
- A static definition of a flow (DAG):
-
- ```sql
- CREATE TABLE pgflow.flows (
-   flow_slug text PRIMARY KEY NOT NULL -- Unique identifier for the flow
-     CHECK (is_valid_slug(flow_slug))
- );
- ```
-
- ### pgflow.steps
-
- A static definition of a step within a flow (the DAG "nodes"):
-
- ```sql
- CREATE TABLE pgflow.steps (
-   flow_slug text NOT NULL REFERENCES flows (flow_slug),
-   step_slug text NOT NULL,
-   step_type text NOT NULL DEFAULT 'single',
-   PRIMARY KEY (flow_slug, step_slug),
-   CHECK (is_valid_slug(flow_slug)),
-   CHECK (is_valid_slug(step_slug))
- );
- ```
-
- ### pgflow.deps
-
- A static definition of dependencies between steps (the DAG "edges"):
-
- ```sql
- CREATE TABLE pgflow.deps (
-   flow_slug text NOT NULL REFERENCES pgflow.flows (flow_slug),
-   dep_slug text NOT NULL, -- The step that must complete first
-   step_slug text NOT NULL, -- The step that depends on dep_slug
-   PRIMARY KEY (flow_slug, dep_slug, step_slug),
-   FOREIGN KEY (flow_slug, dep_slug)
-     REFERENCES pgflow.steps (flow_slug, step_slug),
-   FOREIGN KEY (flow_slug, step_slug)
-     REFERENCES pgflow.steps (flow_slug, step_slug),
-   CHECK (dep_slug != step_slug), -- Prevent self-dependencies
-   CHECK (is_valid_slug(step_slug))
- );
- ```
-
- ### pgflow.runs
-
- A table storing the runtime state of a given flow.
- A run is identified by a `flow_slug` and `run_id`.
-
- ```sql
- CREATE TABLE pgflow.runs (
-   run_id uuid PRIMARY KEY NOT NULL DEFAULT gen_random_uuid(),
-   flow_slug text NOT NULL REFERENCES pgflow.flows (flow_slug), -- denormalized
-   status text NOT NULL DEFAULT 'started',
-   input jsonb NOT NULL,
-   CHECK (status IN ('started', 'failed', 'completed'))
- );
- ```
-
- There is also `status`, which currently can be started, failed or completed.
-
- ### pgflow.step_states
-
- Represents the state of a particular step in a particular run.
-
- ```sql
- -- Step states table - tracks the state of individual steps within a run
- CREATE TABLE pgflow.step_states (
-   flow_slug text NOT NULL REFERENCES pgflow.flows (flow_slug),
-   run_id uuid NOT NULL REFERENCES pgflow.runs (run_id),
-   step_slug text NOT NULL,
-   status text NOT NULL DEFAULT 'created',
-   PRIMARY KEY (run_id, step_slug),
-   FOREIGN KEY (flow_slug, step_slug)
-     REFERENCES pgflow.steps (flow_slug, step_slug),
-   CHECK (status IN ('created', 'started', 'completed', 'failed'))
- );
- ```
-
- ### pgflow.step_tasks
-
- This table is really unique and interesting. We are starting the development
- of the flow orchestration engine with a simple step that runs one unit of work.
-
- But I imagine we would support additional types of steps, like:
-
- - a step that takes an input array and enqueues a task per array item, so they are processed in parallel
- - a step that runs some preprocessing/postprocessing in an additional task
-
- So in order to accommodate this, we need an additional layer between step_state and
- the actual task queue, in order to track which messages belong to which steps,
- in case there is more than one unit of work for a given step.
-
- ```sql
- -- Execution logs table - tracks the tasks of individual steps
- CREATE TABLE pgflow.step_tasks (
-   flow_slug text NOT NULL REFERENCES pgflow.flows (flow_slug),
-   step_slug text NOT NULL,
-   run_id uuid NOT NULL REFERENCES pgflow.runs (run_id),
-   status text NOT NULL DEFAULT 'queued',
-   input jsonb NOT NULL, -- payload that will be passed to queue message
-   output jsonb, -- like step_result but for task, can store result or error/stacktrace
-   message_id bigint, -- an id of the queue message
-   CONSTRAINT step_tasks_pkey PRIMARY KEY (run_id, step_slug),
-   FOREIGN KEY (run_id, step_slug)
-     REFERENCES pgflow.step_states (run_id, step_slug),
-   CHECK (status IN ('queued', 'started', 'failed', 'completed')),
-   CHECK (is_valid_slug(flow_slug)),
-   CHECK (is_valid_slug(step_slug))
- );
- ```
-
- ## Typescript DSL, topological ordering and acyclicity validation
-
- A simple TypeScript DSL will be created that will have strong typing
- and will enforce adding steps in a topological order, preventing
- cycles through the strict ordering of step additions.
-
- The TypeScript DSL looks like this:
-
- ```ts
- const BasicFlow = new Flow<string>()
-   .step('root', ({ run }) => {
-     return `[${run}]r00t`;
-   })
-   .step('left', ['root'], ({ root: r }) => {
-     return `${r}/left`;
-   })
-   .step('right', ['root'], ({ root: r }) => {
-     return `${r}/right`;
-   })
-   .step('end', ['left', 'right'], ({ left, right, run }) => {
-     return `<${left}> and <${right}> of (${run})`;
-   });
- ```
-
- This will be compiled to simple SQL calls to the SQL function `pgflow.add_step(flow_slug, step_slug, dep_step_slugs[])`:
-
- ```sql
- SELECT pgflow.add_step('basic', 'root', ARRAY[]::text[]);
- SELECT pgflow.add_step('basic', 'left', ARRAY['root']);
- SELECT pgflow.add_step('basic', 'right', ARRAY['root']);
- SELECT pgflow.add_step('basic', 'end', ARRAY['left', 'right']);
- ```
-
- ## SQL functions API
-
- This describes the public SQL functions that are available to developers using pgflow
- and to the workers.
-
- The developer calls `start_flow`; the rest is called by the workers.
-
- ### pgflow.start_flow(flow_slug::text, input::jsonb)
-
- This function is used to start a flow.
- It should work like this:
-
- - create a new `pgflow.runs` row for the given flow_slug
- - create all the `pgflow.step_states` rows corresponding to the steps in the flow
- - find root steps (ones without dependencies) and call "start_step" on each of them
-
- ### pgflow.start_step(run_id::uuid, step_slug::text)
-
- This function is called by start_flow, but also by complete_step_task (or somewhere near its call)
- when a worker acknowledges the step_task completion and it is detected that there are ready dependent
- steps to be started.
-
- It should probably call start_step_task under the hood, which will handle:
-
- - updating step_state status/timestamps
- - creating a step_task row
- - enqueueing a queue message for this step_task
-
- For other step types, like array/foreach, it would probably call start_step_task
- for each array item, so more than one step task is created and more than one message is enqueued.
-
- ### pgflow.start_step_task(run_id::uuid, step_slug::text, task_id::bigint)
-
- I am not yet sure how this will work for other step types that will need more step tasks.
- But probably each step type would have its own implementation of this function,
- and a simple step type will just create a new step_task row and enqueue it.
-
- But an array/foreach step type would need a different implementation.
- It would need to check the input for the step, which is an array, and would
- create a new step_task for each array item and enqueue as many messages as there are items in the array.
-
- ### pgflow.complete_step_task(run_id::uuid, step_slug::text, output::jsonb)
-
- This will be called by the worker when a step_task is completed.
- It will work like this in the simplified version, where one step_state corresponds to one step_task:
-
- - it marks the step_task as completed, saving the output
- - it in turn marks the step_state as completed, saving the output
- - then it should check for any dependent steps (steps that depend on the just-completed step) in the same run
- - it should then check if any of those dependent steps are "ready" - meaning all their dependencies are completed
- - for each of those ready steps, it should call start_step
-
- I am not yet sure how this will work for other step types that will need more step tasks.
- Probably each step type would have its own implementation of this function,
- so a simple step will just call complete_step_state when complete_step_task is called.
-
- An array/foreach step type would need a different implementation.
- It would probably need to check if other step_tasks are still pending.
- If all are already completed, it would just call complete_step_state;
- otherwise it would just continue, so another (the last) step task can complete the step state.
-
- ### pgflow.fail_step_task(run_id::uuid, step_slug::text, error::jsonb)
-
- This is very similar to complete_step_task, but it will mark the step_task as failed,
- save the error message, and call fail_step_state instead of complete_step_state.
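To tie the deleted one_shot.md design together, here is a minimal sketch of the call sequence it describes: the developer starts a run, and workers acknowledge task results. Function names follow that prompt; the SQL actually shipped in these versions exposes differently named functions (see the poll_for_tasks, complete_task and fail_task test directories above), and the run_id below is a placeholder.

```sql
-- Developer starts a run (per the deleted prompt's design):
SELECT pgflow.start_flow('basic', '{"url": "https://example.com"}'::jsonb);

-- A worker, after running the handler for a task, acknowledges success:
SELECT pgflow.complete_step_task(
  '00000000-0000-0000-0000-000000000000'::uuid, -- run_id from the created pgflow.runs row
  'root',
  '{"result": "ok"}'::jsonb
);

-- Or reports failure (error thrown by the handler):
SELECT pgflow.fail_step_task(
  '00000000-0000-0000-0000-000000000000'::uuid,
  'root',
  '{"message": "handler threw"}'::jsonb
);
```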