cpflow 5.0.4 → 5.1.0

@@ -29,6 +29,55 @@ You can do this during the initial app setup, like this:
29
29
  6. Find the created secret (it will be in the `$APP_PREFIX-secrets` format) and add the secret env vars there
30
30
  7. Use `cpln://secret/...` in the app to access the secret env vars (e.g., `cpln://secret/$APP_PREFIX-secrets.SOME_VAR`)
31
31
 
32
+ ## Shared Secrets for Review Apps
33
+
34
+ Review apps often need access to a shared staging resource, such as one staging PostgreSQL workload or managed database.
35
+ Creating a database per pull request is expensive and slow, so you can create one shared org-level secret and policy,
36
+ then let each temporary review-app identity reveal that shared secret.
37
+
38
+ Create the shared dictionary secret and policy once in the staging org. The policy must target exactly the shared secret:
39
+
40
+ ```yaml
41
+ kind: policy
42
+ name: my-app-review-database-secrets-policy
43
+ targetKind: secret
44
+ targetLinks:
45
+ - //secret/my-app-review-database-secrets
46
+ ```
47
+
48
+ Then declare the grant in the review app entry in `.controlplane/controlplane.yml`:
49
+
50
+ ```yaml
51
+ apps:
52
+ my-app-review:
53
+ match_if_app_name_starts_with: true
54
+ shared_secret_grants:
55
+ - name: database
56
+ secret_name: my-app-review-database-secrets
57
+ policy_name: my-app-review-database-secrets-policy
58
+ ```
59
+
60
+ Use the generated placeholder in templates instead of hardcoding the secret name:
61
+
62
+ ```yaml
63
+ env:
64
+ - name: DATABASE_URL
65
+ value: cpln://secret/{{SHARED_SECRET_DATABASE}}.DATABASE_URL
66
+ ```
67
+
68
+ `name` must be lower snake case. It becomes `{{SHARED_SECRET_<NAME>}}`, uppercased, in templates. `secret_name`
69
+ and `policy_name` must be Control Plane resource names: lowercase letters, numbers, and dashes only, starting and ending
70
+ with a letter or number.
71
+
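The naming rules above can be sketched in Ruby as follows. This is an illustrative sketch, not cpflow's own validation code: the two regular expressions are one assumed reading of "lower snake case" and "Control Plane resource name", and the constants and the helpers `shared_secret_placeholder` and `render_template` are hypothetical names introduced for this example.

```ruby
# Illustrative only: these patterns are an assumed reading of the rules above,
# not the validation that cpflow itself performs.
GRANT_NAME_PATTERN = /\A[a-z][a-z0-9]*(?:_[a-z0-9]+)*\z/       # lower snake case
RESOURCE_NAME_PATTERN = /\A[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\z/  # letters, numbers, dashes

# Hypothetical helper: builds the template placeholder for a grant name.
def shared_secret_placeholder(name)
  raise ArgumentError, "invalid grant name: #{name}" unless name.match?(GRANT_NAME_PATTERN)

  "{{SHARED_SECRET_#{name.upcase}}}"
end

# Hypothetical helper: replaces each grant's placeholder with its secret name.
def render_template(template, grants)
  grants.reduce(template) do |text, grant|
    unless grant[:secret_name].match?(RESOURCE_NAME_PATTERN)
      raise ArgumentError, "invalid secret name: #{grant[:secret_name]}"
    end

    text.gsub(shared_secret_placeholder(grant[:name]), grant[:secret_name])
  end
end

grants = [{ name: "database", secret_name: "my-app-review-database-secrets" }]
puts render_template("cpln://secret/{{SHARED_SECRET_DATABASE}}.DATABASE_URL", grants)
```

With the grant from the example config, the placeholder resolves to the shared secret name, so the template value becomes a normal `cpln://secret/...` reference.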
72
+ `cpflow setup-app` still creates the per-app secret and policy for app-specific values, and also binds the app identity
73
+ to every configured shared policy. `cpflow deploy-image` repairs missing shared policy bindings before workloads are
74
+ updated, which helps existing review apps recover after the config is added. `cpflow delete` and `cpflow cleanup-stale-apps`
75
+ remove those shared policy bindings when a review app is deleted.
76
+
77
+ For shared databases, keep runtime data isolated by using a per-review-app database name, schema, or tenant key. A common
78
+ pattern is to keep the host, user, and password in the shared secret, then have `hooks.post_creation` create the PR-specific
79
+ database/schema and `hooks.pre_deletion` drop it.
80
+
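One way a `hooks.post_creation` script might derive a PR-specific database name from the review app name is sketched below. The helper `review_database_name` and its sanitizing rules are assumptions made for this example and are not part of cpflow; the 63-character cap reflects PostgreSQL's default identifier length limit.

```ruby
# Hypothetical helper, not part of cpflow: derives a per-review-app database
# name from the app name so each review app keeps its data isolated.
def review_database_name(app_name, max_length: 63)
  # Replace anything outside lowercase letters, digits, and underscores.
  name = app_name.downcase.gsub(/[^a-z0-9_]/, "_")
  # Unquoted PostgreSQL identifiers start with a letter or underscore.
  name = "db_#{name}" unless name.match?(/\A[a-z_]/)
  name[0, max_length]
end

puts review_database_name("my-app-review-123")
```

The same derived name can then be reused by `hooks.pre_deletion` to drop the database, since it depends only on the app name.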
32
81
  Here are the manual steps for reference. We recommend that you follow the steps above:
33
82
 
34
83
  1. In the upper left of the Control Plane console, "Manage Org" menu, click on "Secrets"
data/docs/tips.md CHANGED
@@ -1,16 +1,23 @@
1
1
  # Tips
2
2
 
3
3
  1. [GVCs vs. Orgs](#gvcs-vs-orgs)
4
- 2. [RAM](#ram)
5
- 3. [Remote IP](#remote-ip)
6
- 4. [Secrets and ENV Values](/docs/secrets-and-env-values.md)
7
- 5. [CI](#ci)
8
- 6. [Memcached](#memcached)
9
- 7. [Sidekiq](#sidekiq)
10
- - [Quieting Non-Critical Workers During Deployments](#quieting-non-critical-workers-during-deployments)
11
- - [Setting Up a Pre Stop Hook](#setting-up-a-pre-stop-hook)
12
- - [Setting Up a Liveness Probe](#setting-up-a-liveness-probe)
13
- 8. [Useful Links](#useful-links)
4
+ 2. [Heroku Mappings](#heroku-mappings)
5
+ 3. [RAM](#ram)
6
+ 4. [CPU](#cpu)
7
+ 5. [Remote IP](#remote-ip)
8
+ 6. [Secrets and ENV Values](/docs/secrets-and-env-values.md)
9
+ 7. [CI](#ci)
10
+ 8. [Logs](#logs)
11
+ 9. [Memcached](#memcached)
12
+ 10. [Sidekiq](#sidekiq)
13
+ - [Quieting Non-Critical Workers During Deployments](#quieting-non-critical-workers-during-deployments)
14
+ - [Setting Up a Pre Stop Hook](#setting-up-a-pre-stop-hook)
15
+ - [Setting Up a Liveness Probe](#setting-up-a-liveness-probe)
16
+ 11. [Minimizing Review App Costs](#minimizing-review-app-costs)
17
+ - [Scale the Web Workload to Zero](#scale-the-web-workload-to-zero)
18
+ - [Delete or Pause Abandoned Apps with `cleanup-stale-apps`](#delete-or-pause-abandoned-apps-with-cleanup-stale-apps)
19
+ - [Pause and Resume with `ps:stop` / `ps:start`](#pause-and-resume-with-psstop--psstart)
20
+ 12. [Useful Links](#useful-links)
14
21
 
15
22
  ## GVCs vs. Orgs
16
23
 
@@ -20,6 +27,23 @@
20
27
  - You can have different images within a GVC and even within a workload. This flexibility is one of the key differences
21
28
  compared to Heroku apps.
22
29
 
30
+ ## Heroku Mappings
31
+
32
+ If you're coming from Heroku, these concepts map roughly as follows:
33
+
34
+ | Heroku | Control Plane |
35
+ | ---------------- | ----------------------------------- |
36
+ | App | GVC |
37
+ | Dyno | Replica |
38
+ | Procfile Process | Workload |
39
+ | Config Var | Secret / Environment Variable |
40
+ | Add-on | Managed Service or External Service |
41
+ | Release Phase | Deployment Workflow |
42
+
43
+ These are conceptual equivalents rather than exact matches — see [GVCs vs. Orgs](#gvcs-vs-orgs) above for one key
44
+ difference. For a mapping of Heroku _CLI commands_ to `cpflow`/`cpln`, see
45
+ [Mapping of Heroku Commands](/README.md#mapping-of-heroku-commands-to-cpflow-and-cpln).
46
+
23
47
  ## RAM
24
48
 
25
49
  Any workload replica that reaches the max memory is terminated and restarted. You can configure alerts for workload
@@ -59,6 +83,23 @@ The steps for configuring an alert for workload restarts are almost identical, b
59
83
 
60
84
  For more information on Grafana alerts, see: https://grafana.com/docs/grafana/latest/alerting/
61
85
 
86
+ ## CPU
87
+
88
+ Control Plane workloads can be configured with CPU reservations and limits. If a workload consistently operates near its
89
+ CPU limit, request latency may increase. If CPU is configured as the workload's autoscaling metric (with `maxScale`
90
+ greater than `minScale`), Control Plane will add replicas in response — but the default `templates/rails.yml` pins
91
+ `minScale: 1`, `maxScale: 1`, so it holds a single replica until you configure autoscaling.
92
+
93
+ Worth monitoring:
94
+
95
+ - CPU utilization
96
+ - Request latency
97
+ - Replica count
98
+ - Container restarts
99
+
100
+ Consider configuring an alert for sustained CPU utilization above 80%. You can set this up with the same Grafana
101
+ alerting steps described under [RAM](#ram) above, substituting a CPU utilization query for the memory one.
102
+
62
103
  ## Remote IP
63
104
 
64
105
  The actual remote IP of the workload container is in the 127.0.0.x network, so that will be the value of the
@@ -70,6 +111,9 @@ pick those up and automatically populate `request.remote_ip`.
70
111
 
71
112
  So `REMOTE_ADDR` should not be used directly, only `request.remote_ip`.
72
113
 
114
+ > **Warning:** Do not use `REMOTE_ADDR` for authentication, rate limiting, auditing, or IP allowlists. Always use
115
+ > framework-specific mechanisms that understand proxy headers (such as Rails' `request.remote_ip`).
116
+
73
117
  ## CI
74
118
 
75
119
  **Note:** Docker builds much slower on Apple Silicon, so try configuring CI to build the images when using Apple
@@ -82,17 +126,92 @@ CPLN_TOKEN=...
82
126
  cpln profile create default --token ${CPLN_TOKEN}
83
127
  ```
84
128
 
129
+ The `CPLN_TOKEN=...` line above is illustrative. In CI, don't write the literal token into your workflow file — store it
130
+ in your provider's secret store and let CI inject it as the `CPLN_TOKEN` environment variable, which
131
+ `cpln profile create ... --token ${CPLN_TOKEN}` then reads. See [`examples/circleci.yml`](/examples/circleci.yml) for the
132
+ recommended pattern.
133
+
85
134
  Also, log in to the Control Plane Docker repository if building and pushing an image.
86
135
 
87
136
  ```sh
88
137
  cpln image docker-login
89
138
  ```
90
139
 
140
+ ## Logs
141
+
142
+ `cpflow logs` is a lightweight live-tail command. When you hit `cpln`/`cpflow` line-count or response-size limits, use
143
+ Grafana Loki's [`logcli`](https://grafana.com/docs/loki/latest/query/logcli/) directly against the Control Plane logs
144
+ endpoint for larger historical exports.
145
+
146
+ Install `logcli` with Homebrew when available:
147
+
148
+ ```sh
149
+ brew install logcli
150
+ ```
151
+
152
+ If Homebrew reports that the formula is unavailable, use Grafana's tap:
153
+
154
+ ```sh
155
+ brew tap grafana/grafana
156
+ brew install grafana/grafana/logcli
157
+ ```
158
+
159
+ For Linux, CI, or other environments without Homebrew, see the [`logcli` installation
160
+ docs](https://grafana.com/docs/loki/latest/query/logcli/getting-started/#install-logcli) for binary downloads or source
161
+ builds.
162
+
163
+ Configure it with your Control Plane org and current profile token:
164
+
165
+ ```sh
166
+ export LOKI_ADDR=https://logs.cpln.io/logs/org/YOUR_ORG # run `cpln org get` to find your org name
167
+ export LOKI_BEARER_TOKEN=$(cpln profile token)
168
+ ```
169
+
170
+ `LOKI_BEARER_TOKEN` is a short-lived bearer credential (it typically expires after roughly 15–60 minutes). The
171
+ `$(cpln profile token)` capture above keeps the literal token out of shell history, but any later command that prints
172
+ it (`echo $LOKI_BEARER_TOKEN`, `env | grep LOKI`) will expose it; avoid those, don't commit the value to scripts, and
173
+ watch for it in CI logs. Rerun the token export if `logcli` returns a 401 or another authentication error.
174
+
175
+ Then query logs by label. A Control Plane app is a GVC, so set `gvc` to the app name and narrow by workload or other
176
+ labels as needed. The `--forward` flag returns results oldest-first (chronological), which is almost always what you
177
+ want for incident investigation or sequential reading; omit it to get the `logcli` default of newest-first:
178
+
179
+ ```sh
180
+ logcli query '{gvc="my-app", workload="rails"}' --since 1h --limit 10000 --forward
181
+ ```
182
+
183
+ For cleaner bulk exports, strip label metadata from each output line and redirect the output:
184
+
185
+ ```sh
186
+ logcli query '{gvc="my-app", workload="rails"}' --since 24h --limit 50000 --no-labels --forward > rails.log
187
+ ```
188
+
189
+ For historical incidents, use absolute UTC timestamps instead of a relative `--since` window:
190
+
191
+ ```sh
192
+ logcli query '{gvc="my-app"}' \
193
+ --from="2026-05-27T00:00:00Z" \
194
+ --to="2026-05-27T06:00:00Z" \
195
+ --limit 50000 \
196
+ --no-labels \
197
+ --forward > incident.log
198
+ ```
199
+
200
+ `logcli` silently truncates results once `--limit` is reached, so a partial export looks the same as a complete one.
201
+ To check for truncation, compare the export's line count to the limit: a `wc -l < incident.log` result at or near `--limit` means the export was
202
+ likely cut off. Prefer narrowing the time window (and concatenating the sub-ranges) over raising `--limit`, since the
203
+ server-side cap may be lower than the flag value.
204
+
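The advice above (narrow the time window, then check each export against the limit) can be sketched in Ruby. This is a minimal sketch under stated assumptions: `sub_ranges` and `likely_truncated?` are hypothetical helpers, the two-hour window and the `incident-N.log` file names are arbitrary choices, and the script only prints `logcli` commands built from the flags shown in the examples above rather than running them.

```ruby
require "time"

# Hypothetical helper: splits a time range into consecutive windows of step_seconds.
def sub_ranges(from, to, step_seconds)
  ranges = []
  cursor = from
  while cursor < to
    window_end = [cursor + step_seconds, to].min
    ranges << [cursor, window_end]
    cursor = window_end
  end
  ranges
end

# Hypothetical helper: an export whose line count reaches the limit was probably cut off.
def likely_truncated?(path, limit)
  File.foreach(path).count >= limit
end

from = Time.utc(2026, 5, 27, 0, 0, 0)
to = Time.utc(2026, 5, 27, 6, 0, 0)

# Print one logcli command per two-hour window instead of one six-hour query.
sub_ranges(from, to, 2 * 60 * 60).each_with_index do |(window_start, window_end), index|
  puts %(logcli query '{gvc="my-app"}' --from="#{window_start.iso8601}" ) +
       %(--to="#{window_end.iso8601}" --limit 50000 --no-labels --forward > incident-#{index}.log)
end
```

After running the printed commands, calling `likely_truncated?("incident-0.log", 50_000)` on each file flags the windows that need to be split further before the pieces are concatenated.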
91
205
  ## Memcached
92
206
 
93
207
  On the workload container for Memcached (using the `memcached:alpine` image), configure the command with the args
94
208
  `-l 0.0.0.0`.
95
209
 
210
+ This makes Memcached listen on all network interfaces so other workloads in the GVC can reach it at
211
+ `memcached.APP_GVC.cpln.local`. The `memcached` image already defaults to all interfaces, but passing `-l 0.0.0.0`
212
+ explicitly keeps the intent clear and guards against the listen address being restricted by a future base-image or
213
+ config change.
214
+
96
215
  To do this:
97
216
 
98
217
  1. Navigate to the workload container for Memcached
@@ -114,6 +233,9 @@ There's no need to unquiet the workers, as that will happen automatically after
114
233
  cpflow run 'rails runner "Sidekiq::ProcessSet.new.each { |w| w.quiet! unless w[%q(hostname)].start_with?(%q(criticalworker.)) }"' -a my-app
115
234
  ```
116
235
 
236
+ > **Note:** This assumes critical workers share a consistent hostname prefix (the check matches `hostname`, not
237
+ > Sidekiq's `tag` attribute). If you use a custom naming convention, adjust the `start_with?` check accordingly.
238
+
117
239
  ### Setting Up a Pre Stop Hook
118
240
 
119
241
  By setting up a pre stop hook in the lifecycle of the workload container for Sidekiq, which sends "QUIET" to the workers,
@@ -144,6 +266,130 @@ To do this:
144
266
 
145
267
  To set up a liveness probe on port 7433, see: https://github.com/arturictus/sidekiq_alive
146
268
 
269
+ ## Minimizing Review App Costs
270
+
271
+ Long-tail review apps — PRs that linger for days or weeks with little traffic — can drive up Control Plane spend if every
272
+ workload runs full-time. `cpflow` already provides several knobs to manage this without custom orchestration.
273
+
274
+ > **Note:** Scaling workloads to zero or stopping review apps does not reduce costs from external databases, managed
275
+ > Redis instances, object storage, or other third-party services. Those continue to bill independently of Control Plane
276
+ > workload state.
277
+
278
+ ### Scale the Web Workload to Zero
279
+
280
+ `templates/rails.yml` ships with `type: standard`, `minScale: 1`, `maxScale: 1`. That's a safe default for production,
281
+ but for review apps where cold-start latency is acceptable you can switch the web workload to a serverless type that
282
+ scales to zero replicas when idle. Apply the snippet below to your project's `.controlplane/templates/rails.yml`, or
283
+ create a review-app-specific template (for example `rails-review.yml`) and list it under `setup_app_templates` for the
284
+ review-app entry in `.controlplane/controlplane.yml`.
285
+
286
+ ```yaml
287
+ # Only `type` and `minScale` change from templates/rails.yml; `maxScale`, `capacityAI` and `timeoutSeconds`
288
+ # are shown for context so the full `defaultOptions` block reaches the destination intact.
289
+ # Update the relevant fields in your full templates/rails.yml (or a review-app-specific template); keep
290
+ # containers, firewallConfig, identityLink, and everything else from that file intact.
291
+ kind: workload
292
+ name: rails
293
+ spec:
294
+ type: serverless
295
+ defaultOptions:
296
+ autoscaling:
297
+ minScale: 0
298
+ maxScale: 1
299
+ capacityAI: false # keep your existing value
300
+ timeoutSeconds: 60 # keep your existing value
301
+ ```
302
+
303
+ See [`templates/rails.yml`](/templates/rails.yml) for the full default — `containers`, `firewallConfig`,
304
+ `identityLink`, and the other required fields must be preserved when you copy the snippet above.
305
+
306
+ Control Plane spins the workload back up on the next request. Only `type: serverless` workloads support `minScale: 0`;
307
+ `type: standard` always keeps at least one replica running.
308
+
309
+ Tradeoff: the first request after a quiet period pays the cold-start cost (typically 15–60 seconds for a Rails
310
+ image, depending on app size and boot configuration). For review apps that's usually fine; for production it
311
+ usually isn't.
312
+
313
+ > **Note:** if you later suspend the app with `cpflow ps:stop`, Control Plane will not auto-wake it on the next
314
+ > request. Run `cpflow ps:start` explicitly first. See
315
+ > [Pause and Resume](#pause-and-resume-with-psstop--psstart).
316
+
317
+ ### Delete or Pause Abandoned Apps with `cleanup-stale-apps`
318
+
319
+ For PRs that are clearly done — merged, closed, or untouched for weeks — deleting beats scaling. Set
320
+ `stale_app_image_deployed_days` in `.controlplane/controlplane.yml`:
321
+
322
+ ```yaml
323
+ my-app-review:
324
+ match_if_app_name_starts_with: true
325
+ stale_app_image_deployed_days: 14
326
+ ```
327
+
328
+ Pick a threshold that fits your review cycle — 7 days can catch PRs still in QA; teams with longer review cycles often
329
+ use 14–30 days.
330
+
331
+ > **How staleness is measured:** `stale_app_image_deployed_days` uses the Control Plane image resource's `created`
332
+ > timestamp, typically when the image was pushed to Control Plane's registry. If no matching image exists, it falls back
333
+ > to the GVC's `created` timestamp. It does not consider last traffic or last PR comment.
334
+ > The same stale-app scan applies to both delete and stop modes below.
335
+
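The staleness rule described above can be sketched as follows. This is not cpflow's implementation: the helper `stale?`, its keyword arguments, and the strict "older than" comparison are assumptions made for this example, and the timestamps are made up.

```ruby
require "time"

SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical helper: prefers the image's `created` timestamp and falls back
# to the GVC's `created` timestamp when no matching image exists.
def stale?(image_created:, gvc_created:, threshold_days:, now: Time.now.utc)
  reference = image_created || gvc_created
  (now - Time.iso8601(reference)) > threshold_days * SECONDS_PER_DAY
end

now = Time.utc(2026, 5, 27)

# Image pushed 26 days ago with a 14-day threshold.
puts stale?(image_created: "2026-05-01T00:00:00Z", gvc_created: "2026-04-01T00:00:00Z",
            threshold_days: 14, now: now)

# No image, GVC created 7 days ago.
puts stale?(image_created: nil, gvc_created: "2026-05-20T00:00:00Z",
            threshold_days: 14, now: now)
```

The first call prints `true` and the second prints `false`. Neither input reflects traffic or PR activity, which is why a recently used review app with an old image still counts as stale.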
336
+ Then run in delete mode:
337
+
338
+ ```sh
339
+ cpflow cleanup-stale-apps -a my-app-review --yes
340
+ ```
341
+
342
+ The `--yes` flag skips the interactive confirmation prompt; keep it for CI jobs, or omit it when running manually and
343
+ you want to review the prompt. Because `match_if_app_name_starts_with: true` is set, `-a my-app-review` here matches
344
+ every app whose name starts with that prefix — by contrast, the `cpflow ps:stop -a my-app-review-123` examples below
345
+ target a single concrete app name.
346
+
347
+ This deletes the GVC, workloads, volumesets, and images for any review app whose latest matching image, or GVC when no
348
+ matching image exists, is older than the threshold. It also unbinds the app identity from the secrets policy and any
349
+ configured `shared_secret_grants` policies when those bindings exist. Wire it into a nightly CI cron — see
350
+ [CI Automation — Generated Workflow Behavior](/docs/ci-automation.md#generated-workflow-behavior) for the
351
+ `cpflow-cleanup-stale-review-apps.yml` workflow, which runs in delete mode by default; customize the workflow
352
+ to pass `--mode=stop` if you prefer reversible pausing in CI.
353
+
354
+ For reversible idle handling under the same stale-app scan, use stop mode instead:
355
+
356
+ ```sh
357
+ cpflow cleanup-stale-apps -a my-app-review --mode=stop --yes
358
+ ```
359
+
360
+ This uses the same staleness threshold, but runs `cpflow ps:stop` for each stale app instead of deleting the GVC,
361
+ volumesets, or images. Resume an app later with `cpflow ps:start -a $APP_NAME`. `cpflow ps:stop` only suspends
362
+ workloads listed under `app_workloads` / `additional_workloads` in `.controlplane/controlplane.yml`; workloads
363
+ created outside that config (for example through the Control Plane UI) are left alone — see
364
+ [Pause and Resume](#pause-and-resume-with-psstop--psstart) for details.
365
+
366
+ ### Pause and Resume with `ps:stop` / `ps:start`
367
+
368
+ For review apps you want to keep but pause — for example, a long-running QA branch a tester will come back to — suspend
369
+ all workloads with:
370
+
371
+ ```sh
372
+ cpflow ps:stop -a my-app-review-123
373
+ ```
374
+
375
+ This sets `defaultOptions.suspend: true` on every workload listed under `app_workloads` or `additional_workloads` in
376
+ `.controlplane/controlplane.yml`. Workloads created outside that config (for example through the Control Plane UI) are
377
+ left alone. Resume with:
378
+
379
+ ```sh
380
+ cpflow ps:start -a my-app-review-123
381
+ ```
382
+
383
+ No re-deploy is needed; the workloads come back with the same images they had before.
384
+
385
+ > **Note:** `ps:stop` overrides serverless auto-wake. If the web workload is already serverless (`minScale: 0`),
386
+ > suspending it sets `defaultOptions.suspend: true`, and Control Plane will not bring it back on the next request —
387
+ > `ps:start` must be run explicitly first.
388
+ >
389
+ > **Note:** Sidekiq, Postgres, Redis, and Memcached templates default to `type: standard` and `minScale: 1`, so they
390
+ > keep running while only the web tier sleeps. `cpflow ps:stop -a $APP_NAME` suspends every configured workload, web
391
+ > included, and `cleanup-stale-apps --mode=stop` applies the same pause behavior to stale review apps.
392
+
147
393
  ## Useful Links
148
394
 
149
395
  - For best practices for the app's Dockerfile, see: https://lipanski.com/posts/dockerfile-ruby-best-practices
@@ -58,6 +58,14 @@ aliases:
58
58
  # it would be 'my-app-review-secrets-policy'
59
59
  secrets_policy_name: my-secrets-policy
60
60
 
61
+ # Optional: grant every app identity from this config entry access to
62
+ # existing shared org-level secret policies. Templates can reference the
63
+ # shared secret with {{SHARED_SECRET_DATABASE}}.
64
+ # shared_secret_grants:
65
+ # - name: database
66
+ # secret_name: my-shared-database-secrets
67
+ # policy_name: my-shared-database-secrets-policy
68
+
61
69
  # Configure the workload name used as a template for one-off scripts, like a Heroku one-off dyno.
62
70
  one_off_workload: rails
63
71
 
@@ -32,7 +32,7 @@ module Command
32
32
  <<~PROMPT
33
33
  Set up Control Plane GitHub Flow for this repo. Start with `cpflow github-flow-readiness` and stop on any reported blockers. The repo must be deployable from a clean clone: published package versions, complete runtime scaffold, and a production Dockerfile that can build the app. If any package version is unpublished, inaccessible from CI, or requires credentials that are not already modeled in the repo or GitHub settings, stop and report the blocker instead of generating workflow files. If the repo is a legacy sample pinned to an obsolete Ruby or Bundler toolchain, if it does not even have a production Dockerfile yet, or if it is a monorepo without an already-decided single app boundary for this flow, stop and report that as a prerequisite instead of forcing the rollout.
34
34
 
35
- If `.controlplane/` is missing, run `cpflow generate`. Treat the generated app names as the repo-name default (`#{inferred_app_prefix}`) and rename them only if the project needs a different prefix. Then run `cpflow generate-github-actions` (or `cpflow generate-github-actions --staging-branch BRANCH` when staging should deploy from a branch other than `main`/`master`), keep review apps opt-in via `+review-app-deploy`, make sure any `STAGING_APP_BRANCH` repository variable is also present in the generated staging workflow's `on.push.branches` filter, and list the GitHub secrets and variables that must be configured. Do not hand-edit duplicated upstream refs into the generated wrappers: the only downstream Control Plane Flow pin should be the reusable workflow `uses: ...@vX.Y.Z` value generated from the installed `cpflow` gem version, and upstream workflows load their matching shared actions automatically. When bumping the `cpflow` gem in a downstream repo, run `cpflow update-github-actions` (or `bundle exec cpflow update-github-actions`) and validate with `bin/test-cpflow-github-flow` in the same PR so the checked-in wrappers move to the matching release tag. Keep the standard path simple: review apps require only `CPLN_TOKEN_STAGING` when the generated review app config can be inferred. Document the one-time Control Plane bootstrap command for persistent staging and production apps with `cpflow setup-app --skip-post-creation-hook`; for existing apps or later template updates, document `cpflow apply-template` and the need for the app identity to have `reveal` on the app secret policy. Do not imply the staging deploy or promotion workflows create those persistent GVCs. For production promotion, document a protected `production` GitHub Environment with required reviewers, prevent self-review, and `CPLN_TOKEN_PRODUCTION` stored as an environment secret, not as a repository or organization secret.
35
+ If `.controlplane/` is missing, run `cpflow generate`. Treat the generated app names as the repo-name default (`#{inferred_app_prefix}`) and rename them only if the project needs a different prefix. Then run `cpflow generate-github-actions` (or `cpflow generate-github-actions --staging-branch BRANCH` when staging should deploy from a branch other than `main`/`master`), keep review apps opt-in via `+review-app-deploy`, make sure any `STAGING_APP_BRANCH` repository variable is also present in the generated staging workflow's `on.push.branches` filter, and list the GitHub secrets and variables that must be configured. Do not hand-edit duplicated upstream refs into the generated wrappers: the only downstream Control Plane Flow pin should be the reusable workflow `uses: ...@vX.Y.Z` value generated from the installed `cpflow` gem version, and upstream workflows load their matching shared actions automatically. When bumping the `cpflow` gem in a downstream repo, run `cpflow update-github-actions` (or `bundle exec cpflow update-github-actions`) and validate with `bin/test-cpflow-github-flow` in the same PR so the checked-in wrappers move to the matching release tag. Keep the standard path simple: review apps require only `CPLN_TOKEN_STAGING` when the generated review app config can be inferred. For shared review-app resources such as one staging database, use `shared_secret_grants` and `{{SHARED_SECRET_DATABASE}}` placeholders instead of hardcoding the base app secret name; this keeps review-app policy binding and cleanup automatic while avoiding per-PR database cost. Document the one-time Control Plane bootstrap command for persistent staging and production apps with `cpflow setup-app --skip-post-creation-hook`; for existing apps or later template updates, document `cpflow apply-template` and the need for the app identity to have `reveal` on the app secret policy. Do not imply the staging deploy or promotion workflows create those persistent GVCs. For production promotion, document a protected `production` GitHub Environment with required reviewers, prevent self-review, and `CPLN_TOKEN_PRODUCTION` stored as an environment secret, not as a repository or organization secret.
36
36
 
37
37
  Keep Node available in the final image if asset compilation or SSR depends on ExecJS, Yarn, `pnpm`, or npm after the main install layer. Make sure the generated Dockerfile uses a Ruby base image compatible with the app's declared Ruby requirement. Preserve repo-defined frontend build hooks: if `config/shakapacker.yml` defines a `precompile_hook`, or React on Rails enables `config.auto_load_bundle = true`, confirm the generated Dockerfile runs that codegen step before `rails assets:precompile`. If `config/database.yml` shows SQLite in production, confirm that the generated scaffold uses persistent `db` and `storage` volumes plus a release script that runs `rails db:prepare`; otherwise keep the default Postgres workload. If the public workload is not named `rails`, set `PRIMARY_WORKLOAD` or adjust the generated workflows. Inspect the Dockerfile and package sources for private GitHub dependencies or `RUN --mount=type=ssh`; if present, wire `DOCKER_BUILD_SSH_KEY`, optionally set `DOCKER_BUILD_SSH_KNOWN_HOSTS` for non-GitHub SSH hosts, and keep `DOCKER_BUILD_EXTRA_ARGS` to newline-delimited single tokens such as `--build-arg=FOO=bar`.
38
38
 
@@ -29,6 +29,9 @@ module Command
29
29
  {{APP_IMAGE_LINK}} - full link for latest app image, ready to be used for the value of `containers[].image` in the templates
30
30
  {{APP_IDENTITY}} - default identity
31
31
  {{APP_IDENTITY_LINK}} - full link for identity, ready to be used for the value of `identityLink` in the templates
32
+ {{APP_SECRETS}} - app secret dictionary name
33
+ {{APP_SECRETS_POLICY}} - app secret policy name
34
+ {{SHARED_SECRET_<NAME>}} - shared secret dictionary name from `shared_secret_grants`
32
35
  ```
33
36
  DESC
34
37
  EXAMPLES = <<~EX
data/lib/command/base.rb CHANGED
@@ -587,6 +587,75 @@ module Command
587
587
  @cp ||= Controlplane.new(config)
588
588
  end
589
589
 
590
+ def bind_shared_secret_policy_grants(grant_policy_pairs)
591
+ grant_policy_pairs.each do |grant, policy|
592
+ bind_shared_secret_policy_grant(grant, policy)
593
+ end
594
+ end
595
+
596
+ def resolve_shared_secret_policy_grants
597
+ config.shared_secret_grants.map do |grant|
598
+ [grant, resolve_shared_secret_policy_grant(grant)]
599
+ end
600
+ end
601
+
602
+ def resolve_shared_secret_policy_grant(grant)
603
+ policy = cp.fetch_policy(grant.fetch(:policy_name))
604
+ raise shared_secret_policy_missing_message(grant) if policy.nil?
605
+
606
+ ensure_shared_secret_policy_targets_secret!(grant, policy)
607
+ policy
608
+ end
609
+
610
+ def bind_shared_secret_policy_grant(grant, policy)
611
+ policy_name = grant.fetch(:policy_name)
612
+ return if identity_bound_to_policy_with_reveal?(policy)
613
+
614
+ step("Binding identity '#{config.identity}' to shared secret policy '#{policy_name}'") do
615
+ cp.bind_identity_to_policy(config.identity_link, policy_name)
616
+ end
617
+ end
618
+
619
+ def ensure_shared_secret_policy_targets_secret!(grant, policy)
620
+ return if shared_secret_policy_targets_secret?(grant, policy)
621
+
622
+ raise "Shared secret policy '#{grant.fetch(:policy_name)}' for shared_secret_grants entry " \
623
+ "'#{grant.fetch(:name)}' must target only secret '#{grant.fetch(:secret_name)}'."
624
+ end
625
+
626
+ def shared_secret_policy_targets_secret?(grant, policy)
627
+ target_links = Array(policy["targetLinks"])
628
+
629
+ policy["targetKind"] == "secret" &&
630
+ target_links.one? &&
631
+ shared_secret_policy_target_links(grant).include?(target_links.first)
632
+ end
633
+
634
+ def shared_secret_policy_target_links(grant)
635
+ secret_name = grant.fetch(:secret_name)
636
+ [
637
+ "//secret/#{secret_name}",
638
+ "/org/#{config.org}/secret/#{secret_name}"
639
+ ]
640
+ end
641
+
642
+ def identity_bound_to_policy_with_reveal?(policy)
643
+ identity_policy_permissions(policy).include?("reveal")
644
+ end
645
+
646
+ def identity_policy_permissions(policy)
647
+ Array(policy["bindings"]).flat_map do |binding|
648
+ next [] unless Array(binding["principalLinks"]).include?(config.identity_link)
649
+
650
+ Array(binding["permissions"])
651
+ end.uniq
652
+ end
653
+
654
+ def shared_secret_policy_missing_message(grant)
655
+ "Shared secret policy '#{grant.fetch(:policy_name)}' for shared_secret_grants entry " \
656
+ "'#{grant.fetch(:name)}' does not exist. Create the policy or remove the shared secret grant."
657
+ end
658
+
590
659
  def ensure_docker_running!
591
660
  result = Shell.cmd("docker", "version", capture_stderr: true)
592
661
  return if result[:success]
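The policy checks added in the hunk above can be exercised outside cpflow with plain hashes. The sketch below restates `shared_secret_policy_targets_secret?` and `identity_policy_permissions` as standalone functions, with `org` and `identity_link` passed in place of `config.org` and `config.identity_link`. The function names, the sample policy, and the identity link value are made up for illustration and are not taken from a real Control Plane response.

```ruby
# Standalone restatement of the target check: the policy must be of kind
# "secret" and list exactly one target, in either accepted link form.
def targets_only_secret?(policy, secret_name, org)
  target_links = Array(policy["targetLinks"])
  accepted = ["//secret/#{secret_name}", "/org/#{org}/secret/#{secret_name}"]

  policy["targetKind"] == "secret" && target_links.one? && accepted.include?(target_links.first)
end

# Standalone restatement of the permission lookup: collects the permissions
# from every binding that lists the given identity as a principal.
def permissions_for(policy, identity_link)
  Array(policy["bindings"]).flat_map do |policy_binding|
    next [] unless Array(policy_binding["principalLinks"]).include?(identity_link)

    Array(policy_binding["permissions"])
  end.uniq
end

# Made-up sample data for illustration.
identity_link = "/org/my-org/gvc/my-app-review-123/identity/my-app-review-123-identity"
policy = {
  "targetKind" => "secret",
  "targetLinks" => ["//secret/my-app-review-database-secrets"],
  "bindings" => [
    { "principalLinks" => [identity_link], "permissions" => ["reveal"] }
  ]
}

puts targets_only_secret?(policy, "my-app-review-database-secrets", "my-org")
puts permissions_for(policy, identity_link).include?("reveal")
```

Requiring exactly one target link means a policy that also covers other secrets is rejected, so binding a review-app identity cannot expose more than the one shared secret named in the grant.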
@@ -23,7 +23,7 @@ module Command
23
23
  DESCRIPTION = "Deletes or stops stale apps based on the latest image's creation date"
24
24
  LONG_DESCRIPTION = <<~DESC
25
25
  - Acts on stale apps based on the creation date of the latest image, or the GVC if no images exist
26
- - With `--mode=delete` (default): deletes the whole app (GVC with all workloads, all volumesets and all images), and unbinds the app from the secrets policy as long as both the identity and the policy exist (and are bound)
26
+ - With `--mode=delete` (default): deletes the whole app (GVC with all workloads, all volumesets and all images), and unbinds the app from the secrets policy and any configured `shared_secret_grants` policies as long as both the identity and each policy exist (and are bound)
27
27
  - With `--mode=stop`: suspends all workloads via `cpflow ps:stop` — no GVC, volumeset, or image is removed; resume with `cpflow ps:start`
28
28
  - `--mode=stop` only suspends workloads listed in `app_workloads` + `additional_workloads`; workloads present in the live GVC but missing from the config are skipped silently
29
29
  - `--mode=stop` returns once each workload is marked suspended; it does not wait for the workload to reach a not-ready state
@@ -12,7 +12,8 @@ module Command
12
12
  DESCRIPTION = "Deletes the whole app (GVC with all workloads, all volumesets and all images) or a specific workload"
13
13
  LONG_DESCRIPTION = <<~DESC
14
14
  - Deletes the whole app (GVC with all workloads, all volumesets and all images) or a specific workload
15
- - Also unbinds the app from the secrets policy, as long as both the identity and the policy exist (and are bound)
15
+ - Also unbinds the app from the secrets policy and any configured `shared_secret_grants` policies, as long as both the identity and each policy exist (and are bound)
16
+ - For the app-specific secrets policy, removes every permission held by the app identity; for `shared_secret_grants`, removes only `reveal`
16
17
  - Will ask for explicit user confirmation
17
18
  - Runs a pre-deletion hook before the app is deleted if `hooks.pre_deletion` is specified in the `.controlplane/controlplane.yml` file
18
19
  - If the hook exits with a non-zero code, the command will stop executing and also exit with a non-zero code
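Note on the description change above: the two unbind rules can be sketched in isolation. This is a minimal illustration, not cpflow's implementation. The helper `permissions_to_remove`, the sample policy, and the identity link strings are hypothetical, and the `bindings` / `principalLinks` / `permissions` hash shape is assumed from the keys the removed code reads.

```ruby
# Hypothetical sketch: decide which permissions to remove for one identity.
# The policy hash shape ("bindings" entries holding "principalLinks" and
# "permissions") is an assumption made for illustration.
def permissions_to_remove(policy, identity_link, shared:)
  held = policy.fetch("bindings", [])
               .select { |binding| binding.fetch("principalLinks", []).include?(identity_link) }
               .flat_map { |binding| binding.fetch("permissions", []) }
               .uniq

  # Shared grants only ever lose `reveal`; the app-specific policy loses
  # every permission the identity holds.
  shared ? held & ["reveal"] : held
end

# Illustrative data only; the link strings are placeholders, not real link formats.
SAMPLE_POLICY = {
  "bindings" => [
    { "permissions" => %w[reveal view], "principalLinks" => ["example-identity-link"] },
    { "permissions" => ["reveal"], "principalLinks" => ["other-identity-link"] }
  ]
}.freeze
```

Calling the helper with `shared: false` yields every permission the identity holds in the policy, while `shared: true` narrows the result to `reveal`, so other permissions on a shared policy would be left untouched.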
@@ -55,8 +56,11 @@ module Command
55
56
  check_images
56
57
  return unless confirm_delete(config.app)
57
58
 
59
+ # Snapshot policy state before the pre-deletion hook, so config errors surface
60
+ # before hook side effects while the hook can still use bound shared secrets.
61
+ policy_unbinds = secret_policy_unbinds
58
62
  run_pre_deletion_hook unless config.options[:skip_pre_deletion_hook]
59
- unbind_identity_from_policy
63
+ unbind_identity_from_policy(policy_unbinds)
60
64
  delete_volumesets
61
65
  delete_gvc
62
66
  delete_images
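Note on the ordering in this hunk: the unbind plan is computed before the pre-deletion hook runs and applied after it. A minimal sketch of that "plan, then side effects, then apply" ordering follows, assuming hypothetical callables (`build_plan`, `run_hook`, `apply`) that stand in for the real methods.

```ruby
# Hypothetical sketch of the ordering only; none of these names exist in cpflow.
def delete_with_snapshot(build_plan:, run_hook:, apply:)
  plan = build_plan.call # may raise on bad config, before anything has changed
  run_hook.call          # bindings are still in place while the hook runs
  plan.each { |entry| apply.call(entry) }
end
```

If `build_plan` raises, the hook never runs, which is the property the comment in the hunk describes: configuration errors surface before any hook side effects.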
@@ -125,19 +129,90 @@ module Command
125
129
  end
126
130
  end
127
131
 
128
- def unbind_identity_from_policy
129
- return if cp.fetch_identity(config.identity).nil?
132
+ def unbind_identity_from_policy(policy_unbinds)
133
+ policy_unbinds.each do |policy_unbind|
134
+ unbind_identity_from_secret_policy(policy_unbind)
135
+ end
136
+ end
137
+
138
+ def secret_policy_unbinds
139
+ return [] if cp.fetch_identity(config.identity).nil?
140
+
141
+ [
142
+ app_secret_policy_unbind,
143
+ *shared_secret_policy_unbinds
144
+ ].compact
145
+ end
146
+
147
+ def app_secret_policy_unbind
148
+ policy_unbind_for(
149
+ config.secrets_policy,
150
+ "Unbinding identity from policy for app '#{config.app}'"
151
+ )
152
+ end
153
+
154
+ def shared_secret_policy_unbinds
155
+ config.shared_secret_grants.filter_map do |grant|
156
+ shared_secret_policy_unbind(grant)
157
+ end
158
+ end
130
159
 
131
- policy = cp.fetch_policy(config.secrets_policy)
160
+ def shared_secret_policy_unbind(grant)
161
+ policy_name = grant.fetch(:policy_name)
162
+ policy = cp.fetch_policy(policy_name)
132
163
  return if policy.nil?
133
164
 
134
- is_bound = policy["bindings"].any? do |binding|
135
- binding["principalLinks"].any? { |link| link == config.identity_link }
165
+ # cpflow only grants reveal on shared policies, so that is the only
166
+ # permission we remove from shared grants during cleanup.
167
+ return unless identity_bound_to_policy_with_reveal?(policy)
168
+
169
+ unless shared_secret_policy_targets_secret?(grant, policy)
170
+ # A drifted shared policy should not block teardown. Remove the app
171
+ # identity's reveal binding anyway so reusing the app name cannot inherit access.
172
+ warn_shared_secret_policy_target_mismatch(grant, policy_name)
136
173
  end
137
- return unless is_bound
138
174
 
139
- step("Unbinding identity from policy for app '#{config.app}'") do
140
- cp.unbind_identity_from_policy(config.identity_link, config.secrets_policy)
175
+ shared_secret_policy_unbind_data(policy_name)
176
+ end
177
+
178
+ def shared_secret_policy_unbind_data(policy_name)
179
+ {
180
+ policy_name: policy_name,
181
+ message: "Unbinding identity from shared secret policy '#{policy_name}' for app '#{config.app}'",
182
+ permissions: ["reveal"]
183
+ }
184
+ end
185
+
186
+ def warn_shared_secret_policy_target_mismatch(grant, policy_name)
187
+ progress.puts(
188
+ "Warning: unbinding identity from shared secret policy '#{policy_name}' even though it does not " \
189
+ "target configured secret '#{grant.fetch(:secret_name)}'."
190
+ )
191
+ end
192
+
193
+ def policy_unbind_for(policy_name, message)
194
+ policy = cp.fetch_policy(policy_name)
195
+ return if policy.nil?
196
+
197
+ permissions = identity_policy_permissions(policy)
198
+ return if permissions.empty?
199
+
200
+ {
201
+ policy_name: policy_name,
202
+ message: message,
203
+ permissions: permissions
204
+ }
205
+ end
206
+
207
+ def unbind_identity_from_secret_policy(policy_unbind)
208
+ policy_unbind.fetch(:permissions).each do |permission|
209
+ step("#{policy_unbind.fetch(:message)} (#{permission})") do
210
+ cp.unbind_identity_from_policy(
211
+ config.identity_link,
212
+ policy_unbind.fetch(:policy_name),
213
+ permission: permission
214
+ )
215
+ end
141
216
  end
142
217
  end
143
218