@pgflow/core 0.1.18 → 0.1.20
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +31 -19
- package/dist/ATLAS.md +32 -0
- package/dist/CHANGELOG.md +16 -0
- package/dist/README.md +31 -19
- package/dist/database-types.d.ts +116 -45
- package/dist/database-types.d.ts.map +1 -1
- package/dist/database-types.js +8 -1
- package/dist/package.json +2 -2
- package/dist/supabase/migrations/20250429164909_pgflow_initial.sql +579 -0
- package/package.json +3 -3
- package/dist/supabase/migrations/000000_schema.sql +0 -149
- package/dist/supabase/migrations/000005_create_flow.sql +0 -29
- package/dist/supabase/migrations/000010_add_step.sql +0 -48
- package/dist/supabase/migrations/000015_start_ready_steps.sql +0 -45
- package/dist/supabase/migrations/000020_start_flow.sql +0 -46
- package/dist/supabase/migrations/000030_read_with_poll_backport.sql +0 -70
- package/dist/supabase/migrations/000040_poll_for_tasks.sql +0 -100
- package/dist/supabase/migrations/000045_maybe_complete_run.sql +0 -30
- package/dist/supabase/migrations/000050_complete_task.sql +0 -98
- package/dist/supabase/migrations/000055_calculate_retry_delay.sql +0 -11
- package/dist/supabase/migrations/000060_fail_task.sql +0 -124
- package/dist/supabase/migrations/000_edge_worker_initial.sql +0 -86
package/README.md
CHANGED

@@ -6,6 +6,10 @@ PostgreSQL-native workflow engine for defining, managing, and tracking DAG-based
 > This project is licensed under [AGPL v3](./LICENSE.md) license and is part of **pgflow** stack.
 > See [LICENSING_OVERVIEW.md](../../LICENSING_OVERVIEW.md) in root of this monorepo for more details.
 
+> [!WARNING]
+> This project uses [Atlas](https://atlasgo.io/docs) to manage the schemas and migrations.
+> See [ATLAS.md](ATLAS.md) for more details.
+
 ## Table of Contents
 
 - [Overview](#overview)
@@ -56,10 +60,10 @@ The actual execution of workflow tasks is handled by the [Edge Worker](../edge-w
 
 ### Schema Design
 
-[Schema ERD Diagram (click to enlarge)](./schema.svg)
+[Schema ERD Diagram (click to enlarge)](./assets/schema.svg)
 
-<a href="./schema.svg">
-<img src="./schema.svg" alt="Schema ERD Diagram" width="25%" height="25%">
+<a href="./assets/schema.svg">
+<img src="./assets/schema.svg" alt="Schema ERD Diagram" width="25%" height="25%">
 </a>
 
 ---
@@ -87,23 +91,24 @@ The SQL Core handles the workflow lifecycle through these key operations:
 3. **Task Management**: The [Edge Worker](../edge-worker/README.md) polls for available tasks using `poll_for_tasks`
 4. **State Transitions**: When the Edge Worker reports back using `complete_task` or `fail_task`, the SQL Core handles state transitions and schedules dependent steps
 
-[Flow lifecycle diagram (click to enlarge)](./flow-lifecycle.svg)
+[Flow lifecycle diagram (click to enlarge)](./assets/flow-lifecycle.svg)
 
-<a href="./flow-lifecycle.svg"><img src="./flow-lifecycle.svg" alt="Flow Lifecycle" width="25%" height="25%"></a>
+<a href="./assets/flow-lifecycle.svg"><img src="./assets/flow-lifecycle.svg" alt="Flow Lifecycle" width="25%" height="25%"></a>
 
 ## Example flow and its life
 
-Let's walk through creating and running a workflow that fetches a website,
+Let's walk through creating and running a workflow that fetches a website,
 does summarization and sentiment analysis in parallel steps
 and saves the results to a database.
 
-
+
 
 ### Defining a Workflow
 
 Workflows are defined using two SQL functions: `create_flow` and `add_step`.
 
 In this example, we'll create a workflow with:
+
 - `website` as the entry point ("root step")
 - `sentiment` and `summary` as parallel steps that depend on `website`
 - `saveToDb` as the final step, depending on both parallel steps
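
For orientation (not part of the diff): a minimal SQL sketch of how this example flow could be registered with the two functions named above. Only a truncated `add_step` call for `saveToDb` is visible in the next hunk's header, so the `create_flow` call and the other `add_step` arguments below are assumptions, not the package's verbatim README code.

```sql
-- Hypothetical sketch only; the diff shows just one (truncated) add_step call.
SELECT pgflow.create_flow('analyze_website');

-- Root step with no dependencies
SELECT pgflow.add_step('analyze_website', 'website');

-- Parallel steps that depend on the root step
SELECT pgflow.add_step('analyze_website', 'sentiment', deps_slugs => ARRAY['website']);
SELECT pgflow.add_step('analyze_website', 'summary', deps_slugs => ARRAY['website']);

-- Final step depending on both parallel steps
SELECT pgflow.add_step('analyze_website', 'saveToDb', deps_slugs => ARRAY['sentiment', 'summary']);
```
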
@@ -122,7 +127,7 @@ SELECT pgflow.add_step('analyze_website', 'saveToDb', deps_slugs => ARRAY['senti
 
 > [!NOTE]
 > You can have multiple "root steps" in a workflow. You can even create a root-steps-only workflow
-> to process a single input in parallel, because at the end, all of the outputs from steps
+> to process a single input in parallel, because at the end, all of the outputs from steps
 > that does not have dependents ("final steps") are aggregated and saved as run's `output`.
 
 ### Starting a Workflow Run
@@ -131,16 +136,17 @@ To start a workflow, call `start_flow` with a flow slug and input arguments:
 
 ```sql
 SELECT * FROM pgflow.start_flow(
-flow_slug => 'analyze_website',
+flow_slug => 'analyze_website',
 input => '{"url": "https://example.com"}'::jsonb
 );
 
--- run_id | flow_slug | status | input | output | remaining_steps
+-- run_id | flow_slug | status | input | output | remaining_steps
 -- ------------+-----------------+---------+--------------------------------+--------+-----------------
 -- <run uuid> | analyze_website | started | {"url": "https://example.com"} | [NULL] | 4
 ```
 
 When a workflow starts:
+
 - A new `run` record is created
 - Initial states for all steps are created
 - Root steps are marked as `started`
@@ -187,6 +193,7 @@ SELECT pgflow.complete_task(
 ```
 
 When a task completes:
+
 1. The task status is updated to 'completed' and the output is saved
 2. The message is archived in PGMQ
 3. The step state is updated to 'completed'
@@ -246,6 +253,7 @@ SELECT pgflow.add_step(
 ```
 
 The system applies exponential backoff for retries using the formula:
+
 ```
 delay = base_delay * (2 ^ attempts_count)
 ```
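
As a quick illustration of that formula (an editor's aside, not diff content), the query below computes the delays for `base_delay = 5`, the value used as `baseDelay` in the TypeScript example further down. Whether `attempts_count` starts at 0 or 1 is not stated in the hunks shown here, so the series assumes it starts at 1.

```sql
-- Illustration only: delay = base_delay * (2 ^ attempts_count) with base_delay = 5
SELECT attempts_count,
       5 * 2 ^ attempts_count AS delay_seconds
FROM generate_series(1, 3) AS attempts_count;
-- attempts_count | delay_seconds
-- 1              | 10
-- 2              | 20
-- 3              | 40
```
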
@@ -283,22 +291,25 @@ type Input = {
 };
 
 const AnalyzeWebsite = new Flow<Input>({
-slug:
+slug: 'analyze_website',
 maxAttempts: 3,
 baseDelay: 5,
 timeout: 10,
 })
-.step({ slug: "website" }, async (input) => await scrapeWebsite(input.run.url))
 .step(
-{ slug:
+{ slug: 'website' },
+async (input) => await scrapeWebsite(input.run.url)
+)
+.step(
+{ slug: 'sentiment', dependsOn: ['website'], timeout: 30, maxAttempts: 5 },
 async (input) => await analyzeSentiment(input.website.content)
 )
 .step(
-{ slug:
+{ slug: 'summary', dependsOn: ['website'] },
 async (input) => await summarizeWithAI(input.website.content)
 )
 .step(
-{ slug:
+{ slug: 'saveToDb', dependsOn: ['sentiment', 'summary'] },
 async (input) =>
 await saveToDb({
 websiteUrl: input.run.url,
@@ -332,6 +343,7 @@ This means your step handlers receive exactly the data they need, properly typed
 Handlers in pgflow **must return** JSON-serializable values that are captured and saved when `complete_task` is called. These outputs become available as inputs to dependent steps, allowing data to flow through your workflow pipeline.
 
 When a step is executed, it receives an input object where:
+
 - Each key is a step_slug of a completed dependency
 - Each value is that step's output
 - A special "run" key contains the original workflow input
@@ -342,8 +354,8 @@ When the `sentiment` step runs, it receives:
 
 ```json
 {
-"run": {"url": "https://example.com"},
-"website": {"content": "HTML content", "status": 200}
+"run": { "url": "https://example.com" },
+"website": { "content": "HTML content", "status": 200 }
 }
 ```
 
@@ -353,8 +365,8 @@ The `saveToDb` step depends on both `sentiment` and `summary`:
 
 ```json
 {
-"run": {"url": "https://example.com"},
-"sentiment": {"score": 0.85, "label": "positive"},
+"run": { "url": "https://example.com" },
+"sentiment": { "score": 0.85, "label": "positive" },
 "summary": "This website discusses various topics related to technology and innovation."
 }
 ```
package/dist/ATLAS.md
ADDED

@@ -0,0 +1,32 @@
+# Atlas setup
+
+We use [Atlas](https://atlasgo.io/docs) to generate migrations from the declarative schemas stored in `./schemas/` folder.
+
+## Configuration
+
+The setup is configured in `atlas.hcl`.
+
+It is set to compare `schemas/` to what is in `supabase/migrations/`.
+
+### Docker dev image
+
+Atlas requires a dev database to be available for computing diffs.
+The database must be empty, but contain everything needed for the schemas to apply.
+
+We need a configured [PGMQ](https://github.com/tembo-io/pgmq) extension, which Atlas does not support
+in their dev images.
+
+That's why this setup relies on a custom built image `jumski/postgres-15-pgmq:latest`.
+
+Inspect `Dockerfile.atlas` to see how it is built.
+
+See also `./scripts/build-atlas-postgres-image` and `./scripts/push-atlas-postgres-image` scripts for building and pushing the image.
+
+## Workflow
+
+1. Make sure you start with a clean database (`pnpm supabase db reset`).
+1. Modify the schemas in `schemas/` to a desired state.
+1. Run `./scripts/atlas-migrate-diff <migration-name>` to create a new migration based on the diff.
+1. Run `pnpm supabase migration up` to apply the migration.
+1. In case of any errors, remove the generated migration file, make changes in `schemas/` and repeat the process.
+1. After the migration is applied, verify it does not break tests with `nx test:pgtap`
package/dist/CHANGELOG.md
CHANGED

@@ -1,5 +1,21 @@
 # @pgflow/core
 
+## 0.1.20
+
+### Patch Changes
+
+- 09e3210: Change name of initial migration :-(
+- 985176e: Add step_index to steps and various status timestamps to runtime tables
+- @pgflow/dsl@0.1.20
+
+## 0.1.19
+
+### Patch Changes
+
+- a10b442: Add minimum set of indexes
+- efbd108: Convert migrations to declarative schemas and generate initial migration
+- @pgflow/dsl@0.1.19
+
 ## 0.1.18
 
 ### Patch Changes
package/dist/README.md
CHANGED

Identical to the package/README.md changes above; the dist copy of the README received the same diff.