cpflow 5.0.4 → 5.1.0
- checksums.yaml +4 -4
- data/.github/workflows/cpflow-promote-staging-to-production.yml +48 -9
- data/CHANGELOG.md +14 -1
- data/Gemfile.lock +1 -1
- data/README.md +32 -11
- data/docs/ai-github-flow-prompt.md +1 -1
- data/docs/ci-automation.md +94 -45
- data/docs/commands.md +9 -3
- data/docs/postgres.md +6 -0
- data/docs/rds-private-networking.md +649 -0
- data/docs/secrets-and-env-values.md +49 -0
- data/docs/tips.md +256 -10
- data/examples/controlplane.yml +8 -0
- data/lib/command/ai_github_flow_prompt.rb +1 -1
- data/lib/command/apply_template.rb +3 -0
- data/lib/command/base.rb +69 -0
- data/lib/command/cleanup_stale_apps.rb +1 -1
- data/lib/command/delete.rb +85 -10
- data/lib/command/deploy_image.rb +30 -8
- data/lib/command/generate_github_actions.rb +6 -0
- data/lib/command/setup_app.rb +11 -2
- data/lib/core/config.rb +81 -0
- data/lib/core/controlplane.rb +15 -5
- data/lib/core/template_parser.rb +4 -0
- data/lib/cpflow/version.rb +1 -1
- data/lib/generator_templates/controlplane.yml +7 -0
- data/lib/generator_templates_sqlite/controlplane.yml +7 -0
- data/lib/github_flow_templates/.github/cpflow-help.md +35 -12
- data/lib/github_flow_templates/.github/workflows/cpflow-promote-staging-to-production.yml +583 -15
- data/lib/github_flow_templates/bin/pin-cpflow-github-ref +17 -3
- data/lib/github_flow_templates/bin/test-cpflow-github-flow +61 -9
- metadata +3 -2
data/docs/secrets-and-env-values.md
CHANGED
|
@@ -29,6 +29,55 @@ You can do this during the initial app setup, like this:
|
|
|
29
29
|
6. Find the created secret (it will be in the `$APP_PREFIX-secrets` format) and add the secret env vars there
|
|
30
30
|
7. Use `cpln://secret/...` in the app to access the secret env vars (e.g., `cpln://secret/$APP_PREFIX-secrets.SOME_VAR`)
|
|
31
31
|
|
|
32
|
+
## Shared Secrets for Review Apps
|
|
33
|
+
|
|
34
|
+
Review apps often need access to a shared staging resource, such as one staging PostgreSQL workload or managed database.
|
|
35
|
+
Creating a database per pull request is expensive and slow, so you can create one shared org-level secret and policy,
|
|
36
|
+
then let each temporary review-app identity reveal that shared secret.
|
|
37
|
+
|
|
38
|
+
Create the shared dictionary secret and policy once in the staging org. The policy must target exactly the shared secret:
|
|
39
|
+
|
|
40
|
+
```yaml
|
|
41
|
+
kind: policy
|
|
42
|
+
name: my-app-review-database-secrets-policy
|
|
43
|
+
targetKind: secret
|
|
44
|
+
targetLinks:
|
|
45
|
+
- //secret/my-app-review-database-secrets
|
|
46
|
+
```
|
|
47
|
+
|
|
48
|
+
Then declare the grant in the review app entry in `.controlplane/controlplane.yml`:
|
|
49
|
+
|
|
50
|
+
```yaml
|
|
51
|
+
apps:
|
|
52
|
+
my-app-review:
|
|
53
|
+
match_if_app_name_starts_with: true
|
|
54
|
+
shared_secret_grants:
|
|
55
|
+
- name: database
|
|
56
|
+
secret_name: my-app-review-database-secrets
|
|
57
|
+
policy_name: my-app-review-database-secrets-policy
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
Use the generated placeholder in templates instead of hardcoding the secret name:
|
|
61
|
+
|
|
62
|
+
```yaml
|
|
63
|
+
env:
|
|
64
|
+
- name: DATABASE_URL
|
|
65
|
+
value: cpln://secret/{{SHARED_SECRET_DATABASE}}.DATABASE_URL
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
`name` must be lower snake case. It becomes `{{SHARED_SECRET_<NAME>}}`, uppercased, in templates. `secret_name`
|
|
69
|
+
and `policy_name` must be Control Plane resource names: lowercase letters, numbers, and dashes only, starting and ending
|
|
70
|
+
with a letter or number.
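As a rough illustration of the naming rules above, the following Ruby sketch derives the template placeholder from a grant `name` and checks the resource-name format. The helper names and both regular expressions are assumptions inferred from the prose for this example; they are not cpflow's actual validation code, which is not shown in this diff.

```ruby
# Hypothetical helpers illustrating the documented naming rules.
# Both patterns are assumptions inferred from the prose, not cpflow's own code.
SNAKE_CASE_NAME = /\A[a-z][a-z0-9]*(_[a-z0-9]+)*\z/
RESOURCE_NAME = /\A[a-z0-9]([a-z0-9-]*[a-z0-9])?\z/

# Builds the placeholder a template would reference for a grant name,
# for example "database" becomes "{{SHARED_SECRET_DATABASE}}".
def shared_secret_placeholder(name)
  raise ArgumentError, "name must be lower snake case: #{name.inspect}" unless name.match?(SNAKE_CASE_NAME)

  "{{SHARED_SECRET_#{name.upcase}}}"
end

# True when the value uses only lowercase letters, numbers, and dashes,
# and both starts and ends with a letter or number.
def valid_resource_name?(value)
  value.match?(RESOURCE_NAME)
end

puts shared_secret_placeholder("database")                       # prints {{SHARED_SECRET_DATABASE}}
puts valid_resource_name?("my-app-review-database-secrets")      # prints true
```

Rejecting a leading digit in `name` is a choice made for this sketch; the documentation only requires lower snake case.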
|
|
71
|
+
|
|
72
|
+
`cpflow setup-app` still creates the per-app secret and policy for app-specific values, and also binds the app identity
|
|
73
|
+
to every configured shared policy. `cpflow deploy-image` repairs missing shared policy bindings before workloads are
|
|
74
|
+
updated, which helps existing review apps recover after the config is added. `cpflow delete` and `cpflow cleanup-stale-apps`
|
|
75
|
+
remove those shared policy bindings when a review app is deleted.
|
|
76
|
+
|
|
77
|
+
For shared databases, keep runtime data isolated by using a per-review-app database name, schema, or tenant key. A common
|
|
78
|
+
pattern is to keep the host, user, and password in the shared secret, then have `hooks.post_creation` create the PR-specific
|
|
79
|
+
database/schema and `hooks.pre_deletion` drop it.
|
|
80
|
+
|
|
32
81
|
Here are the manual steps for reference. We recommend that you follow the steps above:
|
|
33
82
|
|
|
34
83
|
1. In the upper left of the Control Plane console, "Manage Org" menu, click on "Secrets"
|
data/docs/tips.md
CHANGED
|
@@ -1,16 +1,23 @@
|
|
|
1
1
|
# Tips
|
|
2
2
|
|
|
3
3
|
1. [GVCs vs. Orgs](#gvcs-vs-orgs)
|
|
4
|
-
2. [
|
|
5
|
-
3. [
|
|
6
|
-
4. [
|
|
7
|
-
5. [
|
|
8
|
-
6. [
|
|
9
|
-
7. [
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
4
|
+
2. [Heroku Mappings](#heroku-mappings)
|
|
5
|
+
3. [RAM](#ram)
|
|
6
|
+
4. [CPU](#cpu)
|
|
7
|
+
5. [Remote IP](#remote-ip)
|
|
8
|
+
6. [Secrets and ENV Values](/docs/secrets-and-env-values.md)
|
|
9
|
+
7. [CI](#ci)
|
|
10
|
+
8. [Logs](#logs)
|
|
11
|
+
9. [Memcached](#memcached)
|
|
12
|
+
10. [Sidekiq](#sidekiq)
|
|
13
|
+
- [Quieting Non-Critical Workers During Deployments](#quieting-non-critical-workers-during-deployments)
|
|
14
|
+
- [Setting Up a Pre Stop Hook](#setting-up-a-pre-stop-hook)
|
|
15
|
+
- [Setting Up a Liveness Probe](#setting-up-a-liveness-probe)
|
|
16
|
+
11. [Minimizing Review App Costs](#minimizing-review-app-costs)
|
|
17
|
+
- [Scale the Web Workload to Zero](#scale-the-web-workload-to-zero)
|
|
18
|
+
- [Delete or Pause Abandoned Apps with `cleanup-stale-apps`](#delete-or-pause-abandoned-apps-with-cleanup-stale-apps)
|
|
19
|
+
- [Pause and Resume with `ps:stop` / `ps:start`](#pause-and-resume-with-psstop--psstart)
|
|
20
|
+
12. [Useful Links](#useful-links)
|
|
14
21
|
|
|
15
22
|
## GVCs vs. Orgs
|
|
16
23
|
|
|
@@ -20,6 +27,23 @@
|
|
|
20
27
|
- You can have different images within a GVC and even within a workload. This flexibility is one of the key differences
|
|
21
28
|
compared to Heroku apps.
|
|
22
29
|
|
|
30
|
+
## Heroku Mappings
|
|
31
|
+
|
|
32
|
+
If you're coming from Heroku, these concepts map roughly as follows:
|
|
33
|
+
|
|
34
|
+
| Heroku | Control Plane |
|
|
35
|
+
| ---------------- | ----------------------------------- |
|
|
36
|
+
| App | GVC |
|
|
37
|
+
| Dyno | Replica |
|
|
38
|
+
| Procfile Process | Workload |
|
|
39
|
+
| Config Var | Secret / Environment Variable |
|
|
40
|
+
| Add-on | Managed Service or External Service |
|
|
41
|
+
| Release Phase | Deployment Workflow |
|
|
42
|
+
|
|
43
|
+
These are conceptual equivalents rather than exact matches — see [GVCs vs. Orgs](#gvcs-vs-orgs) above for one key
|
|
44
|
+
difference. For a mapping of Heroku _CLI commands_ to `cpflow`/`cpln`, see
|
|
45
|
+
[Mapping of Heroku Commands](/README.md#mapping-of-heroku-commands-to-cpflow-and-cpln).
|
|
46
|
+
|
|
23
47
|
## RAM
|
|
24
48
|
|
|
25
49
|
Any workload replica that reaches the max memory is terminated and restarted. You can configure alerts for workload
|
|
@@ -59,6 +83,23 @@ The steps for configuring an alert for workload restarts are almost identical, b
|
|
|
59
83
|
|
|
60
84
|
For more information on Grafana alerts, see: https://grafana.com/docs/grafana/latest/alerting/
|
|
61
85
|
|
|
86
|
+
## CPU
|
|
87
|
+
|
|
88
|
+
Control Plane workloads can be configured with CPU reservations and limits. If a workload consistently operates near its
|
|
89
|
+
CPU limit, request latency may increase. If CPU is configured as the workload's autoscaling metric (with `maxScale`
|
|
90
|
+
greater than `minScale`), Control Plane will add replicas in response — but the default `templates/rails.yml` pins
|
|
91
|
+
`minScale: 1`, `maxScale: 1`, so it holds a single replica until you configure autoscaling.
|
|
92
|
+
|
|
93
|
+
Worth monitoring:
|
|
94
|
+
|
|
95
|
+
- CPU utilization
|
|
96
|
+
- Request latency
|
|
97
|
+
- Replica count
|
|
98
|
+
- Container restarts
|
|
99
|
+
|
|
100
|
+
Consider configuring an alert for sustained CPU utilization above 80%. You can set this up with the same Grafana
|
|
101
|
+
alerting steps described under [RAM](#ram) above, substituting a CPU utilization query for the memory one.
|
|
102
|
+
|
|
62
103
|
## Remote IP
|
|
63
104
|
|
|
64
105
|
The actual remote IP of the workload container is in the 127.0.0.x network, so that will be the value of the
|
|
@@ -70,6 +111,9 @@ pick those up and automatically populate `request.remote_ip`.
|
|
|
70
111
|
|
|
71
112
|
So `REMOTE_ADDR` should not be used directly, only `request.remote_ip`.
|
|
72
113
|
|
|
114
|
+
> **Warning:** Do not use `REMOTE_ADDR` for authentication, rate limiting, auditing, or IP allowlists. Always use
|
|
115
|
+
> framework-specific mechanisms that understand proxy headers (such as Rails' `request.remote_ip`).
|
|
116
|
+
|
|
73
117
|
## CI
|
|
74
118
|
|
|
75
119
|
**Note:** Docker builds much slower on Apple Silicon, so try configuring CI to build the images when using Apple
|
|
@@ -82,17 +126,92 @@ CPLN_TOKEN=...
|
|
|
82
126
|
cpln profile create default --token ${CPLN_TOKEN}
|
|
83
127
|
```
|
|
84
128
|
|
|
129
|
+
The `CPLN_TOKEN=...` line above is illustrative. In CI, don't write the literal token into your workflow file — store it
|
|
130
|
+
in your provider's secret store and let CI inject it as the `CPLN_TOKEN` environment variable, which
|
|
131
|
+
`cpln profile create ... --token ${CPLN_TOKEN}` then reads. See [`examples/circleci.yml`](/examples/circleci.yml) for the
|
|
132
|
+
recommended pattern.
|
|
133
|
+
|
|
85
134
|
Also, log in to the Control Plane Docker repository if building and pushing an image.
|
|
86
135
|
|
|
87
136
|
```sh
|
|
88
137
|
cpln image docker-login
|
|
89
138
|
```
|
|
90
139
|
|
|
140
|
+
## Logs
|
|
141
|
+
|
|
142
|
+
`cpflow logs` is a lightweight live-tail command. When you hit `cpln`/`cpflow` line-count or response-size limits, use
|
|
143
|
+
Grafana Loki's [`logcli`](https://grafana.com/docs/loki/latest/query/logcli/) directly against the Control Plane logs
|
|
144
|
+
endpoint for larger historical exports.
|
|
145
|
+
|
|
146
|
+
Install `logcli` with Homebrew when available:
|
|
147
|
+
|
|
148
|
+
```sh
|
|
149
|
+
brew install logcli
|
|
150
|
+
```
|
|
151
|
+
|
|
152
|
+
If Homebrew reports that the formula is unavailable, use Grafana's tap:
|
|
153
|
+
|
|
154
|
+
```sh
|
|
155
|
+
brew tap grafana/grafana
|
|
156
|
+
brew install grafana/grafana/logcli
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
For Linux, CI, or other environments without Homebrew, see the [`logcli` installation
|
|
160
|
+
docs](https://grafana.com/docs/loki/latest/query/logcli/getting-started/#install-logcli) for binary downloads or source
|
|
161
|
+
builds.
|
|
162
|
+
|
|
163
|
+
Configure it with your Control Plane org and current profile token:
|
|
164
|
+
|
|
165
|
+
```sh
|
|
166
|
+
export LOKI_ADDR=https://logs.cpln.io/logs/org/YOUR_ORG # run `cpln org get` to find your org name
|
|
167
|
+
export LOKI_BEARER_TOKEN=$(cpln profile token)
|
|
168
|
+
```
|
|
169
|
+
|
|
170
|
+
`LOKI_BEARER_TOKEN` is a short-lived bearer credential (it typically expires after roughly 15–60 minutes). The
|
|
171
|
+
`$(cpln profile token)` capture above keeps the literal token out of shell history, but any later command that prints
|
|
172
|
+
it (`echo $LOKI_BEARER_TOKEN`, `env | grep LOKI`) will expose it; avoid those, don't commit the value to scripts, and
|
|
173
|
+
watch for it in CI logs. Rerun the token export if `logcli` returns a 401 or another authentication error.
|
|
174
|
+
|
|
175
|
+
Then query logs by label. A Control Plane app is a GVC, so set `gvc` to the app name and narrow by workload or other
|
|
176
|
+
labels as needed. The `--forward` flag returns results oldest-first (chronological), which is almost always what you
|
|
177
|
+
want for incident investigation or sequential reading; omit it to get the `logcli` default of newest-first:
|
|
178
|
+
|
|
179
|
+
```sh
|
|
180
|
+
logcli query '{gvc="my-app", workload="rails"}' --since 1h --limit 10000 --forward
|
|
181
|
+
```
|
|
182
|
+
|
|
183
|
+
For cleaner bulk exports, strip label metadata from each output line and redirect the output:
|
|
184
|
+
|
|
185
|
+
```sh
|
|
186
|
+
logcli query '{gvc="my-app", workload="rails"}' --since 24h --limit 50000 --no-labels --forward > rails.log
|
|
187
|
+
```
|
|
188
|
+
|
|
189
|
+
For historical incidents, use absolute UTC timestamps instead of a relative `--since` window:
|
|
190
|
+
|
|
191
|
+
```sh
|
|
192
|
+
logcli query '{gvc="my-app"}' \
|
|
193
|
+
--from="2026-05-27T00:00:00Z" \
|
|
194
|
+
--to="2026-05-27T06:00:00Z" \
|
|
195
|
+
--limit 50000 \
|
|
196
|
+
--no-labels \
|
|
197
|
+
--forward > incident.log
|
|
198
|
+
```
|
|
199
|
+
|
|
200
|
+
`logcli` silently truncates results once `--limit` is reached, so a partial export looks the same as a complete one.
|
|
201
|
+
To check for truncation, compare line count to the limit: `wc -l < incident.log` near `--limit` means the export was
|
|
202
|
+
likely cut off. Prefer narrowing the time window (and concatenating the sub-ranges) over raising `--limit`, since the
|
|
203
|
+
server-side cap may be lower than the flag value.
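The truncation check described above can be scripted. The following Ruby sketch counts the lines in an export and flags it when the count reaches the limit. The method name, the command-line handling, and the "at or above the limit" threshold are assumptions chosen for illustration; they are not part of `cpflow` or `logcli`.

```ruby
# Hypothetical helper: flags an export whose line count has reached --limit.
# Assumes one log entry per line, which holds only when entries contain no embedded newlines.
def likely_truncated?(path, limit)
  line_count = File.foreach(path).count
  line_count >= limit
end

# Optional command-line use: pass the export path and the --limit value used for the query.
if ARGV.length == 2
  path = ARGV[0]
  limit = Integer(ARGV[1])
  if likely_truncated?(path, limit)
    warn "#{path} has at least #{limit} lines; the export was likely cut off. Narrow the time window."
  else
    puts "#{path} is below the limit of #{limit} lines."
  end
end
```

Saved under a name of your choosing (say `check_export.rb`), it could be run as `ruby check_export.rb incident.log 50000` after each export.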
|
|
204
|
+
|
|
91
205
|
## Memcached
|
|
92
206
|
|
|
93
207
|
On the workload container for Memcached (using the `memcached:alpine` image), configure the command with the args
|
|
94
208
|
`-l 0.0.0.0`.
|
|
95
209
|
|
|
210
|
+
This makes Memcached listen on all network interfaces so other workloads in the GVC can reach it at
|
|
211
|
+
`memcached.APP_GVC.cpln.local`. The `memcached` image already defaults to all interfaces, but passing `-l 0.0.0.0`
|
|
212
|
+
explicitly keeps the intent clear and guards against the listen address being restricted by a future base-image or
|
|
213
|
+
config change.
|
|
214
|
+
|
|
96
215
|
To do this:
|
|
97
216
|
|
|
98
217
|
1. Navigate to the workload container for Memcached
|
|
@@ -114,6 +233,9 @@ There's no need to unquiet the workers, as that will happen automatically after
|
|
|
114
233
|
cpflow run 'rails runner "Sidekiq::ProcessSet.new.each { |w| w.quiet! unless w[%q(hostname)].start_with?(%q(criticalworker.)) }"' -a my-app
|
|
115
234
|
```
|
|
116
235
|
|
|
236
|
+
> **Note:** This assumes critical workers share a consistent hostname prefix (the check matches `hostname`, not
|
|
237
|
+
> Sidekiq's `tag` attribute). If you use a custom naming convention, adjust the `start_with?` check accordingly.
|
|
238
|
+
|
|
117
239
|
### Setting Up a Pre Stop Hook
|
|
118
240
|
|
|
119
241
|
By setting up a pre stop hook in the lifecycle of the workload container for Sidekiq, which sends "QUIET" to the workers,
|
|
@@ -144,6 +266,130 @@ To do this:
|
|
|
144
266
|
|
|
145
267
|
To set up a liveness probe on port 7433, see: https://github.com/arturictus/sidekiq_alive
|
|
146
268
|
|
|
269
|
+
## Minimizing Review App Costs
|
|
270
|
+
|
|
271
|
+
Long-tail review apps — PRs that linger for days or weeks with little traffic — can drive up Control Plane spend if every
|
|
272
|
+
workload runs full-time. `cpflow` already provides several knobs to manage this without custom orchestration.
|
|
273
|
+
|
|
274
|
+
> **Note:** Scaling workloads to zero or stopping review apps does not reduce costs from external databases, managed
|
|
275
|
+
> Redis instances, object storage, or other third-party services. Those continue to bill independently of Control Plane
|
|
276
|
+
> workload state.
|
|
277
|
+
|
|
278
|
+
### Scale the Web Workload to Zero
|
|
279
|
+
|
|
280
|
+
`templates/rails.yml` ships with `type: standard`, `minScale: 1`, `maxScale: 1`. That's a safe default for production,
|
|
281
|
+
but for review apps where cold-start latency is acceptable you can switch the web workload to a serverless type that
|
|
282
|
+
scales to zero replicas when idle. Apply the snippet below to your project's `.controlplane/templates/rails.yml`, or
|
|
283
|
+
create a review-app-specific template (for example `rails-review.yml`) and list it under `setup_app_templates` for the
|
|
284
|
+
review-app entry in `.controlplane/controlplane.yml`.
|
|
285
|
+
|
|
286
|
+
```yaml
|
|
287
|
+
# Only `type` and `minScale` change from templates/rails.yml; `maxScale`, `capacityAI` and `timeoutSeconds`
|
|
288
|
+
# are shown for context so the full `defaultOptions` block reaches the destination intact.
|
|
289
|
+
# Update the relevant fields in your full templates/rails.yml (or a review-app-specific template); keep
|
|
290
|
+
# containers, firewallConfig, identityLink, and everything else from that file intact.
|
|
291
|
+
kind: workload
|
|
292
|
+
name: rails
|
|
293
|
+
spec:
|
|
294
|
+
type: serverless
|
|
295
|
+
defaultOptions:
|
|
296
|
+
autoscaling:
|
|
297
|
+
minScale: 0
|
|
298
|
+
maxScale: 1
|
|
299
|
+
capacityAI: false # keep your existing value
|
|
300
|
+
timeoutSeconds: 60 # keep your existing value
|
|
301
|
+
```
|
|
302
|
+
|
|
303
|
+
See [`templates/rails.yml`](/templates/rails.yml) for the full default — `containers`, `firewallConfig`,
|
|
304
|
+
`identityLink`, and the other required fields must be preserved when you copy the snippet above.
|
|
305
|
+
|
|
306
|
+
Control Plane spins the workload back up on the next request. Only `type: serverless` workloads support `minScale: 0`;
|
|
307
|
+
`type: standard` always keeps at least one replica running.
|
|
308
|
+
|
|
309
|
+
Tradeoff: the first request after a quiet period pays the cold-start cost (typically 15–60 seconds for a Rails
|
|
310
|
+
image, depending on app size and boot configuration). For review apps that's usually fine; for production it
|
|
311
|
+
usually isn't.
|
|
312
|
+
|
|
313
|
+
> **Note:** if you later suspend the app with `cpflow ps:stop`, Control Plane will not auto-wake it on the next
|
|
314
|
+
> request. Run `cpflow ps:start` explicitly first. See
|
|
315
|
+
> [Pause and Resume](#pause-and-resume-with-psstop--psstart).
|
|
316
|
+
|
|
317
|
+
### Delete or Pause Abandoned Apps with `cleanup-stale-apps`
|
|
318
|
+
|
|
319
|
+
For PRs that are clearly done — merged, closed, or untouched for weeks — deleting beats scaling. Set
|
|
320
|
+
`stale_app_image_deployed_days` in `.controlplane/controlplane.yml`:
|
|
321
|
+
|
|
322
|
+
```yaml
|
|
323
|
+
my-app-review:
|
|
324
|
+
match_if_app_name_starts_with: true
|
|
325
|
+
stale_app_image_deployed_days: 14
|
|
326
|
+
```
|
|
327
|
+
|
|
328
|
+
Pick a threshold that fits your review cycle — 7 days can catch PRs still in QA; teams with longer review cycles often
|
|
329
|
+
use 14–30 days.
|
|
330
|
+
|
|
331
|
+
> **How staleness is measured:** `stale_app_image_deployed_days` uses the Control Plane image resource's `created`
|
|
332
|
+
> timestamp, typically when the image was pushed to Control Plane's registry. If no matching image exists, it falls back
|
|
333
|
+
> to the GVC's `created` timestamp. It does not consider last traffic or last PR comment.
|
|
334
|
+
> The same stale-app scan applies to both delete and stop modes below.
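To make the staleness rule concrete, here is a minimal Ruby sketch of the decision. It is not cpflow's code: the method name, the keyword arguments, and the plain-string timestamps are assumptions, and the strict "older than" comparison is a guess at the boundary behavior. It only mirrors the documented rule: use the image's `created` timestamp, fall back to the GVC's, and compare the age against the threshold in days.

```ruby
require "time"

SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical helper mirroring the documented rule.
# image_created may be nil when no matching image exists.
def stale?(image_created:, gvc_created:, threshold_days:, now: Time.now.utc)
  reference = Time.parse(image_created || gvc_created)
  (now - reference) > threshold_days * SECONDS_PER_DAY
end

now = Time.utc(2026, 5, 27)

# Image pushed 26 days earlier: stale with a 14-day threshold.
puts stale?(image_created: "2026-05-01T00:00:00Z", gvc_created: "2026-04-01T00:00:00Z",
            threshold_days: 14, now: now)

# No image, GVC created 7 days earlier: not stale.
puts stale?(image_created: nil, gvc_created: "2026-05-20T00:00:00Z",
            threshold_days: 14, now: now)
```

Note that a recently pushed image keeps an app fresh even when its GVC is old, which is why the GVC timestamp is only a fallback.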
|
|
335
|
+
|
|
336
|
+
Then run in delete mode:
|
|
337
|
+
|
|
338
|
+
```sh
|
|
339
|
+
cpflow cleanup-stale-apps -a my-app-review --yes
|
|
340
|
+
```
|
|
341
|
+
|
|
342
|
+
The `--yes` flag skips the interactive confirmation prompt; keep it for CI jobs, or omit it when running manually and
|
|
343
|
+
you want to review the prompt. Because `match_if_app_name_starts_with: true` is set, `-a my-app-review` here matches
|
|
344
|
+
every app whose name starts with that prefix — by contrast, the `cpflow ps:stop -a my-app-review-123` examples below
|
|
345
|
+
target a single concrete app name.
|
|
346
|
+
|
|
347
|
+
This deletes the GVC, workloads, volumesets, and images for any review app whose latest matching image, or GVC when no
|
|
348
|
+
matching image exists, is older than the threshold. It also unbinds the app identity from the secrets policy and any
|
|
349
|
+
configured `shared_secret_grants` policies when those bindings exist. Wire it into a nightly CI cron — see
|
|
350
|
+
[CI Automation — Generated Workflow Behavior](/docs/ci-automation.md#generated-workflow-behavior) for the
|
|
351
|
+
`cpflow-cleanup-stale-review-apps.yml` workflow, which runs in delete mode by default; customize the workflow
|
|
352
|
+
to pass `--mode=stop` if you prefer reversible pausing in CI.
|
|
353
|
+
|
|
354
|
+
For reversible idle handling under the same stale-app scan, use stop mode instead:
|
|
355
|
+
|
|
356
|
+
```sh
|
|
357
|
+
cpflow cleanup-stale-apps -a my-app-review --mode=stop --yes
|
|
358
|
+
```
|
|
359
|
+
|
|
360
|
+
This uses the same staleness threshold, but runs `cpflow ps:stop` for each stale app instead of deleting the GVC,
|
|
361
|
+
volumesets, or images. Resume an app later with `cpflow ps:start -a $APP_NAME`. `cpflow ps:stop` only suspends
|
|
362
|
+
workloads listed under `app_workloads` / `additional_workloads` in `.controlplane/controlplane.yml`; workloads
|
|
363
|
+
created outside that config (for example through the Control Plane UI) are left alone — see
|
|
364
|
+
[Pause and Resume](#pause-and-resume-with-psstop--psstart) for details.
|
|
365
|
+
|
|
366
|
+
### Pause and Resume with `ps:stop` / `ps:start`
|
|
367
|
+
|
|
368
|
+
For review apps you want to keep but pause — for example, a long-running QA branch a tester will come back to — suspend
|
|
369
|
+
all workloads with:
|
|
370
|
+
|
|
371
|
+
```sh
|
|
372
|
+
cpflow ps:stop -a my-app-review-123
|
|
373
|
+
```
|
|
374
|
+
|
|
375
|
+
This sets `defaultOptions.suspend: true` on every workload listed under `app_workloads` or `additional_workloads` in
|
|
376
|
+
`.controlplane/controlplane.yml`. Workloads created outside that config (for example through the Control Plane UI) are
|
|
377
|
+
left alone. Resume with:
|
|
378
|
+
|
|
379
|
+
```sh
|
|
380
|
+
cpflow ps:start -a my-app-review-123
|
|
381
|
+
```
|
|
382
|
+
|
|
383
|
+
No re-deploy is needed; the workloads come back with the same images they had before.
|
|
384
|
+
|
|
385
|
+
> **Note:** `ps:stop` overrides serverless auto-wake. If the web workload is already serverless (`minScale: 0`),
|
|
386
|
+
> suspending it sets `defaultOptions.suspend: true`, and Control Plane will not bring it back on the next request —
|
|
387
|
+
> `ps:start` must be run explicitly first.
|
|
388
|
+
>
|
|
389
|
+
> **Note:** Sidekiq, Postgres, Redis, and Memcached templates default to `type: standard` and `minScale: 1`, so they
|
|
390
|
+
> keep running while only the web tier sleeps. `cpflow ps:stop -a $APP_NAME` suspends every configured workload, web
|
|
391
|
+
> included, and `cleanup-stale-apps --mode=stop` applies the same pause behavior to stale review apps.
|
|
392
|
+
|
|
147
393
|
## Useful Links
|
|
148
394
|
|
|
149
395
|
- For best practices for the app's Dockerfile, see: https://lipanski.com/posts/dockerfile-ruby-best-practices
|
data/examples/controlplane.yml
CHANGED
|
@@ -58,6 +58,14 @@ aliases:
|
|
|
58
58
|
# it would be 'my-app-review-secrets-policy'
|
|
59
59
|
secrets_policy_name: my-secrets-policy
|
|
60
60
|
|
|
61
|
+
# Optional: grant every app identity from this config entry access to
|
|
62
|
+
# existing shared org-level secret policies. Templates can reference the
|
|
63
|
+
# shared secret with {{SHARED_SECRET_DATABASE}}.
|
|
64
|
+
# shared_secret_grants:
|
|
65
|
+
# - name: database
|
|
66
|
+
# secret_name: my-shared-database-secrets
|
|
67
|
+
# policy_name: my-shared-database-secrets-policy
|
|
68
|
+
|
|
61
69
|
# Configure the workload name used as a template for one-off scripts, like a Heroku one-off dyno.
|
|
62
70
|
one_off_workload: rails
|
|
63
71
|
|
|
data/lib/command/ai_github_flow_prompt.rb
CHANGED
|
@@ -32,7 +32,7 @@ module Command
|
|
|
32
32
|
<<~PROMPT
|
|
33
33
|
Set up Control Plane GitHub Flow for this repo. Start with `cpflow github-flow-readiness` and stop on any reported blockers. The repo must be deployable from a clean clone: published package versions, complete runtime scaffold, and a production Dockerfile that can build the app. If any package version is unpublished, inaccessible from CI, or requires credentials that are not already modeled in the repo or GitHub settings, stop and report the blocker instead of generating workflow files. If the repo is a legacy sample pinned to an obsolete Ruby or Bundler toolchain, if it does not even have a production Dockerfile yet, or if it is a monorepo without an already-decided single app boundary for this flow, stop and report that as a prerequisite instead of forcing the rollout.
|
|
34
34
|
|
|
35
|
-
If `.controlplane/` is missing, run `cpflow generate`. Treat the generated app names as the repo-name default (`#{inferred_app_prefix}`) and rename them only if the project needs a different prefix. Then run `cpflow generate-github-actions` (or `cpflow generate-github-actions --staging-branch BRANCH` when staging should deploy from a branch other than `main`/`master`), keep review apps opt-in via `+review-app-deploy`, make sure any `STAGING_APP_BRANCH` repository variable is also present in the generated staging workflow's `on.push.branches` filter, and list the GitHub secrets and variables that must be configured. Do not hand-edit duplicated upstream refs into the generated wrappers: the only downstream Control Plane Flow pin should be the reusable workflow `uses: ...@vX.Y.Z` value generated from the installed `cpflow` gem version, and upstream workflows load their matching shared actions automatically. When bumping the `cpflow` gem in a downstream repo, run `cpflow update-github-actions` (or `bundle exec cpflow update-github-actions`) and validate with `bin/test-cpflow-github-flow` in the same PR so the checked-in wrappers move to the matching release tag. Keep the standard path simple: review apps require only `CPLN_TOKEN_STAGING` when the generated review app config can be inferred. Document the one-time Control Plane bootstrap command for persistent staging and production apps with `cpflow setup-app --skip-post-creation-hook`; for existing apps or later template updates, document `cpflow apply-template` and the need for the app identity to have `reveal` on the app secret policy. Do not imply the staging deploy or promotion workflows create those persistent GVCs. For production promotion, document a protected `production` GitHub Environment with required reviewers, prevent self-review, and `CPLN_TOKEN_PRODUCTION` stored as an environment secret, not as a repository or organization secret.
|
|
35
|
+
If `.controlplane/` is missing, run `cpflow generate`. Treat the generated app names as the repo-name default (`#{inferred_app_prefix}`) and rename them only if the project needs a different prefix. Then run `cpflow generate-github-actions` (or `cpflow generate-github-actions --staging-branch BRANCH` when staging should deploy from a branch other than `main`/`master`), keep review apps opt-in via `+review-app-deploy`, make sure any `STAGING_APP_BRANCH` repository variable is also present in the generated staging workflow's `on.push.branches` filter, and list the GitHub secrets and variables that must be configured. Do not hand-edit duplicated upstream refs into the generated wrappers: the only downstream Control Plane Flow pin should be the reusable workflow `uses: ...@vX.Y.Z` value generated from the installed `cpflow` gem version, and upstream workflows load their matching shared actions automatically. When bumping the `cpflow` gem in a downstream repo, run `cpflow update-github-actions` (or `bundle exec cpflow update-github-actions`) and validate with `bin/test-cpflow-github-flow` in the same PR so the checked-in wrappers move to the matching release tag. Keep the standard path simple: review apps require only `CPLN_TOKEN_STAGING` when the generated review app config can be inferred. For shared review-app resources such as one staging database, use `shared_secret_grants` and `{{SHARED_SECRET_DATABASE}}` placeholders instead of hardcoding the base app secret name; this keeps review-app policy binding and cleanup automatic while avoiding per-PR database cost. Document the one-time Control Plane bootstrap command for persistent staging and production apps with `cpflow setup-app --skip-post-creation-hook`; for existing apps or later template updates, document `cpflow apply-template` and the need for the app identity to have `reveal` on the app secret policy. Do not imply the staging deploy or promotion workflows create those persistent GVCs. For production promotion, document a protected `production` GitHub Environment with required reviewers, prevent self-review, and `CPLN_TOKEN_PRODUCTION` stored as an environment secret, not as a repository or organization secret.
|
|
36
36
|
|
|
37
37
|
Keep Node available in the final image if asset compilation or SSR depends on ExecJS, Yarn, `pnpm`, or npm after the main install layer. Make sure the generated Dockerfile uses a Ruby base image compatible with the app's declared Ruby requirement. Preserve repo-defined frontend build hooks: if `config/shakapacker.yml` defines a `precompile_hook`, or React on Rails enables `config.auto_load_bundle = true`, confirm the generated Dockerfile runs that codegen step before `rails assets:precompile`. If `config/database.yml` shows SQLite in production, confirm that the generated scaffold uses persistent `db` and `storage` volumes plus a release script that runs `rails db:prepare`; otherwise keep the default Postgres workload. If the public workload is not named `rails`, set `PRIMARY_WORKLOAD` or adjust the generated workflows. Inspect the Dockerfile and package sources for private GitHub dependencies or `RUN --mount=type=ssh`; if present, wire `DOCKER_BUILD_SSH_KEY`, optionally set `DOCKER_BUILD_SSH_KNOWN_HOSTS` for non-GitHub SSH hosts, and keep `DOCKER_BUILD_EXTRA_ARGS` to newline-delimited single tokens such as `--build-arg=FOO=bar`.
|
|
38
38
|
|
|
data/lib/command/apply_template.rb
CHANGED
|
@@ -29,6 +29,9 @@ module Command
|
|
|
29
29
|
{{APP_IMAGE_LINK}} - full link for latest app image, ready to be used for the value of `containers[].image` in the templates
|
|
30
30
|
{{APP_IDENTITY}} - default identity
|
|
31
31
|
{{APP_IDENTITY_LINK}} - full link for identity, ready to be used for the value of `identityLink` in the templates
|
|
32
|
+
{{APP_SECRETS}} - app secret dictionary name
|
|
33
|
+
{{APP_SECRETS_POLICY}} - app secret policy name
|
|
34
|
+
{{SHARED_SECRET_<NAME>}} - shared secret dictionary name from `shared_secret_grants`
|
|
32
35
|
```
|
|
33
36
|
DESC
|
|
34
37
|
EXAMPLES = <<~EX
|
data/lib/command/base.rb
CHANGED
|
@@ -587,6 +587,75 @@ module Command
|
|
|
587
587
|
@cp ||= Controlplane.new(config)
|
|
588
588
|
end
|
|
589
589
|
|
|
590
|
+
def bind_shared_secret_policy_grants(grant_policy_pairs)
|
|
591
|
+
grant_policy_pairs.each do |grant, policy|
|
|
592
|
+
bind_shared_secret_policy_grant(grant, policy)
|
|
593
|
+
end
|
|
594
|
+
end
|
|
595
|
+
|
|
596
|
+
def resolve_shared_secret_policy_grants
|
|
597
|
+
config.shared_secret_grants.map do |grant|
|
|
598
|
+
[grant, resolve_shared_secret_policy_grant(grant)]
|
|
599
|
+
end
|
|
600
|
+
end
|
|
601
|
+
|
|
602
|
+
def resolve_shared_secret_policy_grant(grant)
|
|
603
|
+
policy = cp.fetch_policy(grant.fetch(:policy_name))
|
|
604
|
+
raise shared_secret_policy_missing_message(grant) if policy.nil?
|
|
605
|
+
|
|
606
|
+
ensure_shared_secret_policy_targets_secret!(grant, policy)
|
|
607
|
+
policy
|
|
608
|
+
end
|
|
609
|
+
|
|
610
|
+
def bind_shared_secret_policy_grant(grant, policy)
|
|
611
|
+
policy_name = grant.fetch(:policy_name)
|
|
612
|
+
return if identity_bound_to_policy_with_reveal?(policy)
|
|
613
|
+
|
|
614
|
+
step("Binding identity '#{config.identity}' to shared secret policy '#{policy_name}'") do
|
|
615
|
+
cp.bind_identity_to_policy(config.identity_link, policy_name)
|
|
616
|
+
end
|
|
617
|
+
end
|
|
618
|
+
|
|
619
|
+
def ensure_shared_secret_policy_targets_secret!(grant, policy)
|
|
620
|
+
return if shared_secret_policy_targets_secret?(grant, policy)
|
|
621
|
+
|
|
622
|
+
raise "Shared secret policy '#{grant.fetch(:policy_name)}' for shared_secret_grants entry " \
|
|
623
|
+
"'#{grant.fetch(:name)}' must target only secret '#{grant.fetch(:secret_name)}'."
|
|
624
|
+
end
|
|
625
|
+
|
|
626
|
+
def shared_secret_policy_targets_secret?(grant, policy)
|
|
627
|
+
target_links = Array(policy["targetLinks"])
|
|
628
|
+
|
|
629
|
+
policy["targetKind"] == "secret" &&
|
|
630
|
+
target_links.one? &&
|
|
631
|
+
shared_secret_policy_target_links(grant).include?(target_links.first)
|
|
632
|
+
end
|
|
633
|
+
|
|
634
|
+
def shared_secret_policy_target_links(grant)
|
|
635
|
+
secret_name = grant.fetch(:secret_name)
|
|
636
|
+
[
|
|
637
|
+
"//secret/#{secret_name}",
|
|
638
|
+
"/org/#{config.org}/secret/#{secret_name}"
|
|
639
|
+
]
|
|
640
|
+
end
|
|
641
|
+
|
|
642
|
+
def identity_bound_to_policy_with_reveal?(policy)
|
|
643
|
+
identity_policy_permissions(policy).include?("reveal")
|
|
644
|
+
end
|
|
645
|
+
|
|
646
|
+
def identity_policy_permissions(policy)
|
|
647
|
+
Array(policy["bindings"]).flat_map do |binding|
|
|
648
|
+
next [] unless Array(binding["principalLinks"]).include?(config.identity_link)
|
|
649
|
+
|
|
650
|
+
Array(binding["permissions"])
|
|
651
|
+
end.uniq
|
|
652
|
+
end
|
|
653
|
+
|
|
654
|
+
def shared_secret_policy_missing_message(grant)
|
|
655
|
+
"Shared secret policy '#{grant.fetch(:policy_name)}' for shared_secret_grants entry " \
|
|
656
|
+
"'#{grant.fetch(:name)}' does not exist. Create the policy or remove the shared secret grant."
|
|
657
|
+
end
|
|
658
|
+
|
|
590
659
|
def ensure_docker_running!
|
|
591
660
|
result = Shell.cmd("docker", "version", capture_stderr: true)
|
|
592
661
|
return if result[:success]
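To illustrate the checks added to `base.rb` above, here is a minimal, self-contained sketch that runs the target check and the `reveal` lookup against plain hashes. It is not the gem's code: the helpers take `org` and `identity_link` as arguments instead of reading `config`, and the org name, secret name, and identity link in the usage lines are made-up values. Only the hash keys (`targetKind`, `targetLinks`, `bindings`, `principalLinks`, `permissions`) come from the diff.

```ruby
# Hypothetical standalone variants of the helpers in the diff; `org` and
# `identity_link` are passed in explicitly instead of coming from `config`.
def shared_secret_policy_target_links(org, secret_name)
  ["//secret/#{secret_name}", "/org/#{org}/secret/#{secret_name}"]
end

# True only when the policy targets exactly one secret and that secret is
# the one named in the grant (short or org-qualified link form).
def shared_secret_policy_targets_secret?(policy, org, secret_name)
  target_links = Array(policy["targetLinks"])

  policy["targetKind"] == "secret" &&
    target_links.one? &&
    shared_secret_policy_target_links(org, secret_name).include?(target_links.first)
end

# Collects every permission that any binding grants to the given identity.
def identity_policy_permissions(policy, identity_link)
  Array(policy["bindings"]).flat_map do |binding|
    next [] unless Array(binding["principalLinks"]).include?(identity_link)

    Array(binding["permissions"])
  end.uniq
end

# Made-up sample data, for illustration only.
identity_link = "/org/my-org/gvc/my-app/identity/my-app-identity"
policy = {
  "targetKind" => "secret",
  "targetLinks" => ["//secret/shared-db"],
  "bindings" => [
    { "principalLinks" => [identity_link], "permissions" => ["reveal"] }
  ]
}

puts shared_secret_policy_targets_secret?(policy, "my-org", "shared-db")
puts identity_policy_permissions(policy, identity_link).include?("reveal")
```

A policy that lists a second target link, or a different `targetKind`, fails the first check, which matches the "must target only secret" error raised in the diff.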
|
data/lib/command/cleanup_stale_apps.rb
CHANGED
|
@@ -23,7 +23,7 @@ module Command
|
|
|
23
23
|
DESCRIPTION = "Deletes or stops stale apps based on the latest image's creation date"
|
|
24
24
|
LONG_DESCRIPTION = <<~DESC
|
|
25
25
|
- Acts on stale apps based on the creation date of the latest image, or the GVC if no images exist
|
|
26
|
-
- With `--mode=delete` (default): deletes the whole app (GVC with all workloads, all volumesets and all images), and unbinds the app from the secrets policy as long as both the identity and
|
|
26
|
+
- With `--mode=delete` (default): deletes the whole app (GVC with all workloads, all volumesets and all images), and unbinds the app from the secrets policy and any configured `shared_secret_grants` policies as long as both the identity and each policy exist (and are bound)
|
|
27
27
|
- With `--mode=stop`: suspends all workloads via `cpflow ps:stop` — no GVC, volumeset, or image is removed; resume with `cpflow ps:start`
|
|
28
28
|
- `--mode=stop` only suspends workloads listed in `app_workloads` + `additional_workloads`; workloads present in the live GVC but missing from the config are skipped silently
|
|
29
29
|
- `--mode=stop` returns once each workload is marked suspended; it does not wait for the workload to reach a not-ready state
|
data/lib/command/delete.rb
CHANGED
|
@@ -12,7 +12,8 @@ module Command
|
|
|
12
12
|
DESCRIPTION = "Deletes the whole app (GVC with all workloads, all volumesets and all images) or a specific workload"
|
|
13
13
|
LONG_DESCRIPTION = <<~DESC
|
|
14
14
|
- Deletes the whole app (GVC with all workloads, all volumesets and all images) or a specific workload
|
|
15
|
-
- Also unbinds the app from the secrets policy, as long as both the identity and
|
|
15
|
+
- Also unbinds the app from the secrets policy and any configured `shared_secret_grants` policies, as long as both the identity and each policy exist (and are bound)
|
|
16
|
+
- For the app-specific secrets policy, removes every permission held by the app identity; for `shared_secret_grants`, removes only `reveal`
|
|
16
17
|
- Will ask for explicit user confirmation
|
|
17
18
|
- Runs a pre-deletion hook before the app is deleted if `hooks.pre_deletion` is specified in the `.controlplane/controlplane.yml` file
|
|
18
19
|
- If the hook exits with a non-zero code, the command will stop executing and also exit with a non-zero code
|
|
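The unbind rule described in `LONG_DESCRIPTION` can be sketched on plain data as follows. This is an illustrative model only: `permissions_for` and `plan_unbinds` are hypothetical names, policies are passed in as a hash keyed by policy name rather than fetched from Control Plane, and the sketch omits the early exit the command takes when the identity itself does not exist.

```ruby
# Hypothetical helper: permissions that a policy hash binds to identity_link.
# A missing policy (nil) yields an empty list.
def permissions_for(policy, identity_link)
  Array(policy && policy["bindings"]).flat_map do |binding|
    Array(binding["principalLinks"]).include?(identity_link) ? Array(binding["permissions"]) : []
  end.uniq
end

# Builds the list of unbinds to perform:
# - app secrets policy: every permission held by the identity
# - shared policies: only "reveal", and only if it is currently bound
def plan_unbinds(policies, identity_link, app_policy_name, shared_policy_names)
  plan = []

  app_permissions = permissions_for(policies[app_policy_name], identity_link)
  plan << { policy_name: app_policy_name, permissions: app_permissions } unless app_permissions.empty?

  shared_policy_names.each do |name|
    next unless permissions_for(policies[name], identity_link).include?("reveal")

    plan << { policy_name: name, permissions: ["reveal"] }
  end

  plan
end
```

Computing the plan up front mirrors the snapshot the command takes before running the pre-deletion hook: the hook can still read shared secrets, and the unbinds run afterwards from the saved list.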
@@ -55,8 +56,11 @@ module Command
|
|
|
55
56
|
check_images
|
|
56
57
|
return unless confirm_delete(config.app)
|
|
57
58
|
|
|
59
|
+
# Snapshot policy state before the pre-deletion hook, so config errors surface
|
|
60
|
+
# before hook side effects while the hook can still use bound shared secrets.
|
|
61
|
+
policy_unbinds = secret_policy_unbinds
|
|
58
62
|
run_pre_deletion_hook unless config.options[:skip_pre_deletion_hook]
|
|
59
|
-
unbind_identity_from_policy
|
|
63
|
+
unbind_identity_from_policy(policy_unbinds)
|
|
60
64
|
delete_volumesets
|
|
61
65
|
delete_gvc
|
|
62
66
|
delete_images
|
|
@@ -125,19 +129,90 @@ module Command
|
|
|
125
129
|
end
|
|
126
130
|
end
|
|
127
131
|
|
|
128
|
-
def unbind_identity_from_policy
|
|
129
|
-
|
|
132
|
+
def unbind_identity_from_policy(policy_unbinds)
|
|
133
|
+
policy_unbinds.each do |policy_unbind|
|
|
134
|
+
unbind_identity_from_secret_policy(policy_unbind)
|
|
135
|
+
end
|
|
136
|
+
end
|
|
137
|
+
|
|
138
|
+
def secret_policy_unbinds
|
|
139
|
+
return [] if cp.fetch_identity(config.identity).nil?
|
|
140
|
+
|
|
141
|
+
[
|
|
142
|
+
app_secret_policy_unbind,
|
|
143
|
+
*shared_secret_policy_unbinds
|
|
144
|
+
].compact
|
|
145
|
+
end
|
|
146
|
+
|
|
147
|
+
def app_secret_policy_unbind
|
|
148
|
+
policy_unbind_for(
|
|
149
|
+
config.secrets_policy,
|
|
150
|
+
"Unbinding identity from policy for app '#{config.app}'"
|
|
151
|
+
)
|
|
152
|
+
end
|
|
153
|
+
|
|
154
|
+
def shared_secret_policy_unbinds
|
|
155
|
+
config.shared_secret_grants.filter_map do |grant|
|
|
156
|
+
shared_secret_policy_unbind(grant)
|
|
157
|
+
end
|
|
158
|
+
end
|
|
130
159
|
|
|
131
|
-
|
|
160
|
+
def shared_secret_policy_unbind(grant)
|
|
161
|
+
policy_name = grant.fetch(:policy_name)
|
|
162
|
+
policy = cp.fetch_policy(policy_name)
|
|
132
163
|
return if policy.nil?
|
|
133
164
|
|
|
134
|
-
|
|
135
|
-
|
|
165
|
+
# cpflow only grants reveal on shared policies, so that is the only
|
|
166
|
+
# permission we remove from shared grants during cleanup.
|
|
167
|
+
return unless identity_bound_to_policy_with_reveal?(policy)
|
|
168
|
+
|
|
169
|
+
unless shared_secret_policy_targets_secret?(grant, policy)
|
|
170
|
+
# A drifted shared policy should not block teardown. Remove the app
|
|
171
|
+
# identity's reveal binding anyway so reusing the app name cannot inherit access.
|
|
172
|
+
warn_shared_secret_policy_target_mismatch(grant, policy_name)
|
|
136
173
|
end
|
|
137
|
-
return unless is_bound
|
|
138
174
|
|
|
139
|
-
|
|
140
|
-
|
|
175
|
+
shared_secret_policy_unbind_data(policy_name)
|
|
176
|
+
end
|
|
177
|
+
|
|
178
|
+
def shared_secret_policy_unbind_data(policy_name)
|
|
179
|
+
{
|
|
180
|
+
policy_name: policy_name,
|
|
181
|
+
message: "Unbinding identity from shared secret policy '#{policy_name}' for app '#{config.app}'",
|
|
182
|
+
permissions: ["reveal"]
|
|
183
|
+
}
|
|
184
|
+
end
|
|
185
|
+
|
|
186
|
+
def warn_shared_secret_policy_target_mismatch(grant, policy_name)
|
|
187
|
+
progress.puts(
|
|
188
|
+
"Warning: unbinding identity from shared secret policy '#{policy_name}' even though it does not " \
|
|
189
|
+
"target configured secret '#{grant.fetch(:secret_name)}'."
|
|
190
|
+
)
|
|
191
|
+
end
|
|
192
|
+
|
|
193
|
+
def policy_unbind_for(policy_name, message)
|
|
194
|
+
policy = cp.fetch_policy(policy_name)
|
|
195
|
+
return if policy.nil?
|
|
196
|
+
|
|
197
|
+
permissions = identity_policy_permissions(policy)
|
|
198
|
+
return if permissions.empty?
|
|
199
|
+
|
|
200
|
+
{
|
|
201
|
+
policy_name: policy_name,
|
|
202
|
+
message: message,
|
|
203
|
+
permissions: permissions
|
|
204
|
+
}
|
|
205
|
+
end
|
|
206
|
+
|
|
207
|
+
def unbind_identity_from_secret_policy(policy_unbind)
|
|
208
|
+
policy_unbind.fetch(:permissions).each do |permission|
|
|
209
|
+
step("#{policy_unbind.fetch(:message)} (#{permission})") do
|
|
210
|
+
cp.unbind_identity_from_policy(
|
|
211
|
+
config.identity_link,
|
|
212
|
+
policy_unbind.fetch(:policy_name),
|
|
213
|
+
permission: permission
|
|
214
|
+
)
|
|
215
|
+
end
|
|
141
216
|
end
|
|
142
217
|
end
|
|
143
218
|
|