@yottagraph-app/aether-instructions 1.1.48 → 1.1.49
- package/commands/deploy_job.md +227 -0
- package/commands/deploy_workflow.md +189 -0
- package/package.json +1 -1

@@ -0,0 +1,227 @@

# Deploy Compute Job

Deploy a Cloud Run Job from the `jobs/` directory via the Broadchurch Portal.

## Overview

A "compute job" is a containerized Python (or any-language) entrypoint
that runs on Google Cloud Run Jobs. Use it for:

- **Cron jobs** (set `schedule:` in `job.yaml`; Cloud Scheduler is wired up automatically)
- **Event-triggered batch work** (HTTP-triggered from your Vercel app or an Agent Engine tool)
- **Heavy compute** (entity enrichment, scoring, ETL, exports, aggregations)
- **Workflow steps** (called from a Cloud Workflow definition under `workflows/`)

This command triggers a Cloud Build → Cloud Run Job deploy through
the Portal. No local GCP credentials needed.

The job must live in `jobs/<name>/` with at minimum:

```
jobs/<name>/
├── main.py           # Entrypoint (or any executable; see Dockerfile)
├── requirements.txt  # Python deps (only required if no custom Dockerfile)
├── job.yaml          # Manifest: resources, schedule, env
└── Dockerfile        # Optional; auto-generated if missing
```

**Prerequisite:** The project must have a valid `broadchurch.yaml`
(created during provisioning).

---

## Step 1: Read Configuration

Read `broadchurch.yaml` from the project root.

```bash
cat broadchurch.yaml
```

**If the file does not exist:**

> This project hasn't been provisioned yet. Create it in the Broadchurch Portal first.

Stop here.

Extract these values:

- `tenant.org_id` (tenant org ID)
- `gateway.url` (Portal Gateway URL)
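
The extraction can be sketched without extra tooling. This is a minimal, dependency-free example that assumes `broadchurch.yaml` stores both values as plain two-level block mappings (`tenant:` followed by an indented `org_id:` line, and likewise for `gateway:` and `url:`). The helper name `yaml_get` is hypothetical, and a real YAML parser is the more robust choice for anything beyond this simple layout.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value of "<section>.<key>" from a YAML file
# that uses plain two-level block mappings, for example:
#   tenant:
#     org_id: org_123
yaml_get() {
  local file=$1 section=$2 key=$3
  awk -v section="$section" -v key="$key" '
    # A non-indented "name:" line starts a new top-level section.
    /^[A-Za-z_]/ { in_section = (index($0, section ":") == 1); next }
    # Inside the wanted section, match the indented "key:" line.
    in_section && $1 == (key ":") {
      sub(/^[ \t]+[^:]+:[ \t]*/, "")   # drop indentation and "key:"
      gsub(/^["\047]|["\047]$/, "")    # drop surrounding quotes, if any
      print
      exit
    }
  ' "$file"
}

# Usage, run from the project root:
#   ORG_ID=$(yaml_get broadchurch.yaml tenant org_id)
#   GATEWAY_URL=$(yaml_get broadchurch.yaml gateway url)
```

The two variables then stand in for the `<ORG_ID>` and `<GATEWAY_URL>` placeholders used in the `curl` calls.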

---

## Step 2: Discover Jobs

List the directories under `jobs/`:

```bash
ls -d jobs/*/
```

**If no directories exist:**

> No jobs found. Create one by making a directory under `jobs/` with the structure above.
> See `docs/COMPUTE_JOBS.md` for guidance, or copy `jobs/example_job/` as a starting point.

Stop here.

**Skip `example_job`:** it is a template placeholder and should
never be deployed. Filter it out before proceeding.

**If multiple jobs remain:** When called from `/build_my_app`, deploy
all of them. When called interactively, ask the user which one to deploy.

**If only one job remains:** Proceed with it; no confirmation is needed.
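
The discovery and filtering rules above can be sketched as a small function. The name `list_deployable_jobs` is hypothetical; the only behavior taken from this document is that `example_job` is skipped.

```bash
#!/usr/bin/env bash
# Print the name of every job directory under the given root (default: jobs),
# one per line, skipping the example_job template.
list_deployable_jobs() {
  local root=${1:-jobs} dir name
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue          # glob matched nothing
    name=$(basename "$dir")
    [ "$name" = "example_job" ] && continue
    printf '%s\n' "$name"
  done
}

# Usage:
#   list_deployable_jobs          # lists jobs/*/ except example_job
```

An empty result corresponds to the "no jobs found" case, one line to the single-job case, and several lines to the multiple-jobs case.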

**Important:** Job directory names should use underscores; the deploy
workflow translates them to Cloud Run-friendly hyphens automatically.
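
To make the naming rule concrete, here is a sketch of the translation. The exact rule the deploy workflow applies is not shown in this document, so this assumes a plain underscore-to-hyphen substitution; `to_cloud_run_name` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Assumed translation: replace every underscore with a hyphen.
to_cloud_run_name() {
  local dir_name=$1
  printf '%s\n' "${dir_name//_/-}"
}

# Usage:
#   to_cloud_run_name enrich_entities
```

Under that assumption, a directory called `jobs/enrich_entities/` would surface in Cloud Run as `enrich-entities`.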

---

## Step 3: Validate Job Structure

For the selected job directory, verify the required files exist:

```bash
ls jobs/<name>/main.py jobs/<name>/job.yaml
```

If `Dockerfile` exists, the deploy uses it as-is. If not, the deploy
auto-generates a Python 3.12 Dockerfile that runs `python main.py`.

If you rely on the auto-generated Dockerfile, `requirements.txt` must also exist:

```bash
ls jobs/<name>/requirements.txt 2>/dev/null
```

Validate that the manifest is well-formed:

```bash
yq -e '.name // ""' jobs/<name>/job.yaml >/dev/null
```
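
The file checks in this step can be combined into one function. This is a sketch under the rules stated above (`main.py` and `job.yaml` always required, `requirements.txt` required only without a custom `Dockerfile`); the name `validate_job_dir` is hypothetical, and the YAML well-formedness check is left to `yq`.

```bash
#!/usr/bin/env bash
# Return 0 when the job directory has the files Step 3 asks for,
# otherwise report each missing file on stderr and return 1.
validate_job_dir() {
  local dir=$1 missing=0 f
  for f in main.py job.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      missing=1
    fi
  done
  # Without a custom Dockerfile, the auto-generated one needs requirements.txt.
  if [ ! -f "$dir/Dockerfile" ] && [ ! -f "$dir/requirements.txt" ]; then
    echo "missing: $dir/requirements.txt (no custom Dockerfile present)" >&2
    missing=1
  fi
  return "$missing"
}

# Usage:
#   validate_job_dir "jobs/$JOB_NAME" || echo "fix the job layout before deploying"
```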

---

## Step 4: Ensure Code is Pushed

The deployment workflow runs on the code in the GitHub repo, not the
local working directory:

```bash
git status
```

**If there are uncommitted changes in `jobs/<name>/`:**

> Your job code has local changes that aren't pushed yet. The
> deployment will use the version on GitHub. Would you like me to
> commit and push first?

If yes, commit and push. If no, warn the user that the deployed version
may be stale, then continue.
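
The check can be scripted rather than read off `git status` by eye. This sketch uses `git status --porcelain` for uncommitted or untracked files and `git log '@{upstream}..'` for commits that exist locally but not on the tracked remote branch; both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Succeeds (exit 0) when the given path has uncommitted or untracked changes.
has_local_changes() {
  local path=$1
  [ -n "$(git status --porcelain -- "$path")" ]
}

# Succeeds when the current branch has commits not yet pushed to its upstream.
# Note: this also reports "nothing unpushed" when no upstream is configured.
has_unpushed_commits() {
  [ -n "$(git log --oneline '@{upstream}..' 2>/dev/null)" ]
}

# Usage:
#   if has_local_changes "jobs/$JOB_NAME" || has_unpushed_commits; then
#     echo "Local work is not on GitHub yet; the deploy would use the older version."
#   fi
```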

---

## Step 5: Trigger Deployment

Call the Portal API to trigger the deploy workflow:

```bash
curl -sf -X POST "<GATEWAY_URL>/api/projects/<ORG_ID>/deploy" \
  -H "Content-Type: application/json" \
  -d '{"type": "job", "name": "<JOB_NAME>"}'
```

**If this fails with 404:** The job directory may not exist on GitHub
yet. Push your code first.

**If this succeeds:** The Portal has triggered the `deploy-job.yml`
GitHub Actions workflow.
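
With `curl -sf`, a 404 only shows up as a non-zero exit code, so telling it apart from other failures takes an extra step. The sketch below captures the HTTP status with curl's `-w '%{http_code}'` option and maps it to the two outcomes described above. The function names and the wording of the fallback message are assumptions, and the job name is interpolated into the JSON body without escaping.

```bash
#!/usr/bin/env bash
# Map an HTTP status code from the deploy endpoint to a next step.
explain_deploy_status() {
  case $1 in
    2??) echo "Deploy triggered: the Portal started the GitHub Actions workflow." ;;
    404) echo "Not found: the directory may not exist on GitHub yet. Push your code first." ;;
    *)   echo "Unexpected HTTP status $1: check the gateway URL and org ID." ;;
  esac
}

# POST the deploy request and print only the HTTP status code.
# Arguments: gateway URL, org ID, job name (the values gathered in Steps 1-2).
trigger_job_deploy() {
  local gateway_url=$1 org_id=$2 job_name=$3
  curl -s -o /dev/null -w '%{http_code}' -X POST \
    "$gateway_url/api/projects/$org_id/deploy" \
    -H "Content-Type: application/json" \
    -d "{\"type\": \"job\", \"name\": \"$job_name\"}"
}

# Usage:
#   explain_deploy_status "$(trigger_job_deploy "$GATEWAY_URL" "$ORG_ID" "$JOB_NAME")"
```

When the request never reaches the server, curl prints `000` as the status, which falls through to the generic branch.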

---

## Step 6: Monitor Progress

> Deployment triggered! The compute job is being deployed via GitHub Actions.
>
> - **Job:** <name>
> - **Workflow:** deploy-job.yml
>
> This typically takes 2-5 minutes (container build + Cloud Run Job create/update).
> You can monitor progress:
>
> - In the Broadchurch Portal under your project's "Jobs" tab
> - On GitHub: `https://github.com/<REPO>/actions`
>
> Once complete:
>
> - The job is callable via the Portal "Run now" button
> - If `schedule:` is set in `job.yaml`, Cloud Scheduler will trigger it automatically
> - Run history is visible in the Portal's "Jobs" tab

---

## Step 7: (Optional) Trigger a Test Run

After deployment, trigger an ad-hoc run to verify the job works:

```bash
curl -sf -X POST "<GATEWAY_URL>/api/projects/<ORG_ID>/jobs/<JOB_NAME>/run" \
  -H "Content-Type: application/json" \
  -d '{}'
```

Then poll for results:

```bash
curl -sf "<GATEWAY_URL>/api/projects/<ORG_ID>/jobs/<JOB_NAME>/runs" | jq '.runs[0]'
```

Each run has a `status` field. Terminal statuses are: `Succeeded`,
`Failed`, `Cancelled`.
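
Polling until a terminal status can be sketched as a loop. This assumes `.runs[0]` is the most recent run, which the polling command above suggests but does not state; the function names, the 10-second interval, and the 60-attempt cap are illustrative choices.

```bash
#!/usr/bin/env bash
# Terminal statuses as listed above.
is_terminal_status() {
  case $1 in
    Succeeded|Failed|Cancelled) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll the first run in the list until it reaches a terminal status, then print it.
# Argument: the full ".../jobs/<JOB_NAME>/runs" URL.
wait_for_latest_run() {
  local runs_url=$1 status attempt
  for attempt in $(seq 1 60); do
    status=$(curl -sf "$runs_url" | jq -r '.runs[0].status')
    if is_terminal_status "$status"; then
      echo "$status"
      return 0
    fi
    sleep 10
  done
  echo "timed out waiting for a terminal status" >&2
  return 1
}

# Usage:
#   wait_for_latest_run "$GATEWAY_URL/api/projects/$ORG_ID/jobs/$JOB_NAME/runs"
```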

---

## Troubleshooting

### Build fails

Check the GitHub Actions logs for the `Deploy Compute Job` workflow.
Common issues:

- **"requirements.txt" errors**: list every Python dep your `main.py` imports.
- **Custom Dockerfile**: ensure the `CMD` actually runs your entrypoint.
- **Memory/CPU mismatch**: Cloud Run Jobs require at least 1 vCPU per 4 GiB of memory (see Google Cloud docs).

### Job times out

Increase `task_timeout` in `job.yaml`. Cloud Run Jobs supports up to 24
hours per task. For longer-running work, split into shards
(`task_count: N` + `parallelism: N`) or escalate to GCP Batch.
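
When a job is split into shards, each task needs to pick its own slice of the work. Cloud Run Jobs exposes the task's position through the `CLOUD_RUN_TASK_INDEX` and `CLOUD_RUN_TASK_COUNT` environment variables. The sketch below shows one way to use them, a round-robin split over lines of input; the function name `shard_lines` is hypothetical, and the same arithmetic carries over to a Python `main.py`.

```bash
#!/usr/bin/env bash
# Read work items (one per line) on stdin and print only this task's share.
# Item k (0-based) goes to the task whose index equals k modulo the task count.
shard_lines() {
  local index=${CLOUD_RUN_TASK_INDEX:-0} count=${CLOUD_RUN_TASK_COUNT:-1}
  awk -v i="$index" -v n="$count" '(NR - 1) % n == i'
}

# Usage inside a job entrypoint (items.txt is a placeholder input file):
#   shard_lines < items.txt | while read -r item; do echo "processing $item"; done
```

The defaults (index 0, count 1) make the same entrypoint process everything when it runs outside Cloud Run.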

### Schedule doesn't fire

Cloud Scheduler entries are named `job-<name>`. Check in the Cloud
Console (Cloud Scheduler → us-central1) and verify:

- The cron expression is valid
- The OAuth service account email matches the tenant SA
- The target URL points at the Cloud Run Job's `:run` endpoint
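
A quick local sanity check for the first item is to confirm the `schedule:` value has the five fields of the unix-cron format that Cloud Scheduler uses. This is only a shape check under that assumption, not a full validator, and `has_five_cron_fields` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Succeeds when the expression has exactly five whitespace-separated fields
# (minute, hour, day of month, month, day of week).
has_five_cron_fields() {
  local fields
  # read -a splits on whitespace without expanding "*" as a glob.
  read -r -a fields <<< "$1"
  [ "${#fields[@]}" -eq 5 ]
}

# Usage:
#   has_five_cron_fields "0 6 * * *" || echo "schedule does not look like unix-cron"
```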

### Need to update an existing job

Just run `/deploy_job` again. It will rebuild the container, update
the Cloud Run Job in place, and reconcile the Cloud Scheduler entry
to match the current `job.yaml`.

### Want to delete a job

Delete via the Portal "Jobs" tab, or:

```bash
curl -sf -X DELETE "<GATEWAY_URL>/api/projects/<ORG_ID>/jobs/<JOB_NAME>"
```

This removes the Cloud Run Job, its Cloud Scheduler entry, and the
Portal's job registration.

@@ -0,0 +1,189 @@

# Deploy Cloud Workflow

Deploy a Cloud Workflow definition from the `workflows/` directory via
the Broadchurch Portal.

## Overview

A Cloud Workflow orchestrates one or more compute jobs. It's the
declarative DAG layer on top of the job runtime, useful for:

- **Multi-step pipelines** ("for each entity: enrich, score, write")
- **Fan-out / fan-in** patterns (parallel jobs that converge)
- **Retry / error-branch logic** managed by the workflow runner, not the jobs
- **Scheduled multi-step orchestration** (set `schedule:` in `manifest.yaml`)

Cloud Workflows handles state, retries, fan-out, and observability.
Your jobs stay stateless and idempotent; that's the design point.

The workflow must live in `workflows/<name>/` with:

```
workflows/<name>/
├── workflow.yaml  # Cloud Workflows DSL: the DAG itself
└── manifest.yaml  # Name, schedule, log level, schedule input
```

**Prerequisite:** The project must have a valid `broadchurch.yaml`,
and any jobs the workflow references must already be deployed via
`/deploy_job`.

---

## Step 1: Read Configuration

```bash
cat broadchurch.yaml
```

**If the file does not exist:** Stop and tell the user the project
isn't provisioned yet.

Extract:

- `tenant.org_id`
- `gateway.url`

---

## Step 2: Discover Workflows

```bash
ls -d workflows/*/
```

**If no directories exist:** Suggest copying `workflows/example_workflow/`
or creating a new one matching the structure above. Stop here.

**Skip `example_workflow`:** it's a template placeholder. Filter it out.

**If multiple remain:** Ask which to deploy.

**If one remains:** Proceed without confirmation.

---

## Step 3: Validate Workflow Structure

```bash
ls workflows/<name>/workflow.yaml workflows/<name>/manifest.yaml
```

Validate `workflow.yaml` is syntactically valid:

```bash
yq -e '.main.steps' workflows/<name>/workflow.yaml >/dev/null
```

Validate `manifest.yaml` has a name:

```bash
yq -e '.name' workflows/<name>/manifest.yaml >/dev/null
```

---

## Step 4: Ensure Referenced Jobs Are Deployed

Cloud Workflows can't deploy referenced jobs for you. Check the
workflow definition for any `googleapis.run.v1.namespaces.jobs.run`
calls and verify those jobs exist:

```bash
grep -oP 'jobs/\K[a-z0-9-]+' workflows/<name>/workflow.yaml | sort -u
```

For each job name found, confirm it appears in the Portal:

```bash
curl -sf "<GATEWAY_URL>/api/projects/<ORG_ID>/jobs" | jq '.jobs[].job_name'
```

If a referenced job isn't deployed yet:

> The workflow `<name>` references job `<job>` which isn't deployed.
> Deploy it first with `/deploy_job <job>`.
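
The comparison between referenced and deployed jobs can be sketched as two small functions. `referenced_jobs` is a variant of the `grep -oP` command above that avoids `\K`, since `-P` is a GNU grep extension; `missing_jobs` prints the names that are referenced but absent from the deployed list. Both names are hypothetical, and this document does not state whether `job_name` in the Portal response uses the hyphenated or the underscored form, so the lists may need normalizing first.

```bash
#!/usr/bin/env bash
# Print job names referenced in a workflow file, one per line, de-duplicated.
referenced_jobs() {
  grep -oE 'jobs/[a-z0-9-]+' "$1" | sed 's|^jobs/||' | sort -u
}

# Print every line of the first file that does not appear in the second.
# Exits non-zero when nothing is missing (grep printed no lines).
missing_jobs() {
  local referenced=$1 deployed=$2
  grep -vxFf "$deployed" "$referenced"
}

# Usage:
#   referenced_jobs "workflows/$WORKFLOW_NAME/workflow.yaml" > referenced.txt
#   curl -sf "$GATEWAY_URL/api/projects/$ORG_ID/jobs" | jq -r '.jobs[].job_name' > deployed.txt
#   missing_jobs referenced.txt deployed.txt
```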

---

## Step 5: Ensure Code is Pushed

```bash
git status
```

If there are uncommitted changes in `workflows/<name>/`, prompt to
commit and push.

---

## Step 6: Trigger Deployment

```bash
curl -sf -X POST "<GATEWAY_URL>/api/projects/<ORG_ID>/deploy" \
  -H "Content-Type: application/json" \
  -d '{"type": "workflow", "name": "<WORKFLOW_NAME>"}'
```

**If 404:** The directory may not exist on GitHub yet. Push first.

**If success:** The Portal has triggered the `deploy-workflow.yml`
GitHub Actions workflow.

---

## Step 7: Monitor Progress

> Workflow deployment triggered.
>
> - **Workflow:** <name>
> - **GitHub Action:** deploy-workflow.yml
>
> Typically completes in under a minute (workflow definitions are tiny).
>
> Once deployed:
>
> - The workflow is executable via the Portal "Run now" button on the Workflows tab
> - If `schedule:` is set, Cloud Scheduler invokes it automatically
> - Execution history is in the Portal Workflows tab

---

## Step 8: (Optional) Trigger a Test Execution

```bash
curl -sf -X POST "<GATEWAY_URL>/api/projects/<ORG_ID>/workflows/<WORKFLOW_NAME>/run" \
  -H "Content-Type: application/json" \
  -d '{"input": {"limit": 10}}'
```

Then poll for the result:

```bash
curl -sf "<GATEWAY_URL>/api/projects/<ORG_ID>/workflows/<WORKFLOW_NAME>/executions" | jq '.executions[0]'
```

---

## Troubleshooting

### Workflow rejected with parsing errors

Cloud Workflows YAML has strict syntax. Common gotchas:

- Indentation must use spaces (no tabs)
- `${...}` expressions must be quoted strings in YAML
- `parallel for` requires `value` and `range` (or `in`) under `for:`
- `args:` for `googleapis.run.v1.*` must include `name: namespaces/<project>/jobs/<job>`

Reference: https://cloud.google.com/workflows/docs/reference/syntax
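
The first gotcha is easy to catch before deploying. This sketch lists every line of a workflow file that contains a tab character; `find_tabs` is a hypothetical name, and the check covers only tabs, not the other items in the list.

```bash
#!/usr/bin/env bash
# Print "line-number:content" for each line containing a tab.
# Succeeds (exit 0) when at least one tab is found, fails when the file is clean.
find_tabs() {
  grep -n "$(printf '\t')" "$1"
}

# Usage:
#   if find_tabs "workflows/$WORKFLOW_NAME/workflow.yaml"; then
#     echo "replace the tabs above with spaces before deploying"
#   fi
```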

### Workflow times out

Cloud Workflows has a 1-year max duration, but each step has its own
limits. Long-running step calls (`googleapis.run.v1.namespaces.jobs.run`)
should use `connector_params: {timeout: <seconds>}` to bound them.

### Schedule doesn't fire

Cloud Scheduler entries are named `workflow-<name>`. Verify in the
Cloud Console.