cpflow 5.0.4 → 5.1.1

Files changed (50)
  1. checksums.yaml +4 -4
  2. data/.github/actions/cpflow-wait-for-health/action.yml +11 -4
  3. data/.github/workflows/cpflow-promote-staging-to-production.yml +269 -43
  4. data/.github/workflows/rspec-shared.yml +8 -1
  5. data/CHANGELOG.md +28 -1
  6. data/Gemfile.lock +1 -1
  7. data/README.md +36 -11
  8. data/docs/ai-github-flow-prompt.md +1 -1
  9. data/docs/assets/logo/favicon.ico +0 -0
  10. data/docs/assets/logo/icon-1024.png +0 -0
  11. data/docs/assets/logo/icon-128.png +0 -0
  12. data/docs/assets/logo/icon-16.png +0 -0
  13. data/docs/assets/logo/icon-192.png +0 -0
  14. data/docs/assets/logo/icon-24.png +0 -0
  15. data/docs/assets/logo/icon-32.png +0 -0
  16. data/docs/assets/logo/icon-48.png +0 -0
  17. data/docs/assets/logo/icon-512.png +0 -0
  18. data/docs/assets/logo/icon-64.png +0 -0
  19. data/docs/assets/logo/icon-tile.svg +17 -0
  20. data/docs/assets/logo/mark-transparent.svg +16 -0
  21. data/docs/ci-automation.md +137 -47
  22. data/docs/commands.md +13 -3
  23. data/docs/postgres.md +6 -0
  24. data/docs/rds-private-networking.md +649 -0
  25. data/docs/secrets-and-env-values.md +49 -0
  26. data/docs/tips.md +256 -10
  27. data/examples/controlplane.yml +8 -0
  28. data/lib/command/ai_github_flow_prompt.rb +1 -1
  29. data/lib/command/apply_template.rb +3 -0
  30. data/lib/command/base.rb +69 -0
  31. data/lib/command/cleanup_stale_apps.rb +1 -1
  32. data/lib/command/delete.rb +85 -10
  33. data/lib/command/deploy_image.rb +30 -8
  34. data/lib/command/generate_github_actions.rb +6 -0
  35. data/lib/command/maintenance_off.rb +1 -0
  36. data/lib/command/maintenance_on.rb +1 -0
  37. data/lib/command/run.rb +25 -5
  38. data/lib/command/setup_app.rb +11 -2
  39. data/lib/core/config.rb +81 -0
  40. data/lib/core/controlplane.rb +15 -5
  41. data/lib/core/maintenance_mode.rb +93 -6
  42. data/lib/core/template_parser.rb +4 -0
  43. data/lib/cpflow/version.rb +1 -1
  44. data/lib/generator_templates/controlplane.yml +7 -0
  45. data/lib/generator_templates_sqlite/controlplane.yml +7 -0
  46. data/lib/github_flow_templates/.github/cpflow-help.md +48 -13
  47. data/lib/github_flow_templates/.github/workflows/cpflow-promote-staging-to-production.yml +768 -15
  48. data/lib/github_flow_templates/bin/pin-cpflow-github-ref +17 -3
  49. data/lib/github_flow_templates/bin/test-cpflow-github-flow +61 -9
  50. metadata +15 -2
@@ -17,16 +17,32 @@ module Command
  - The release script is run in the context of `cpflow run` with the latest image
  - If the release script exits with a non-zero code, the command will stop executing and also exit with a non-zero code
  - If `use_digest_image_ref` is `true` in the `.controlplane/controlplane.yml` file or `--use-digest-image-ref` option is provided, deployed image's reference will include its digest
+ - Repairs missing `shared_secret_grants` policy bindings before running a release phase or updating workloads
  DESC

- def call # rubocop:disable Metrics/MethodLength
- run_release_script if config.options[:run_release_phase]
+ def call
+ release_script = release_script_to_run
+ image = resolve_image_to_deploy
+ shared_secret_policy_grant_pairs = resolve_shared_secret_policy_grants
+ workload_data_by_name = app_workload_data
+
+ bind_shared_secret_policy_grants(shared_secret_policy_grant_pairs)
+ run_release_script(release_script) if release_script
+ deploy_image_to_workloads(image, workload_data_by_name)
+ end
+
+ private
+
+ def app_workload_data
+ config[:app_workloads].to_h do |workload|
+ [workload, cp.fetch_workload!(workload)]
+ end
+ end

+ def deploy_image_to_workloads(image, workload_data_by_name) # rubocop:disable Metrics/MethodLength
  deployed_endpoints = {}
- image = resolve_image_to_deploy

- config[:app_workloads].each do |workload|
- workload_data = cp.fetch_workload!(workload)
+ workload_data_by_name.each do |workload, workload_data|
  workload_data.dig("spec", "containers").each do |container|
  next unless container["image"].match?(%r{^/org/#{config.org}/image/#{config.app}[:@]})

@@ -47,8 +63,6 @@ module Command
  end
  end

- private
-
  def resolve_image_to_deploy
  image = cp.latest_image
  # Preserve the pre-existing fail-fast check so missing images are reported
@@ -93,8 +107,16 @@ module Command
  deployments.dig("items", 0, "status", "endpoint")
  end

- def run_release_script
+ def release_script_to_run
+ return unless config.options[:run_release_phase]
+
  release_script = config[:release_script]
+ return release_script if release_script.is_a?(String) && !release_script.strip.empty?
+
+ raise "release_script must be configured when --run-release-phase is provided."
+ end
+
+ def run_release_script(release_script)
  run_command_in_latest_image(release_script, title: "release script")
  end
  end
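
The new `release_script_to_run` guard in the hunk above can be exercised on its own. The following is a minimal sketch, assuming a stand-alone method with hypothetical keyword arguments in place of `config.options[:run_release_phase]` and `config[:release_script]`; it is not the gem's code, only the same three-way decision (skip, return the script, or raise before anything is changed).

```ruby
# Hypothetical stand-alone version of the guard. The keyword arguments are
# assumptions that replace the gem's config lookups.
def release_script_to_run(run_release_phase:, release_script:)
  # No release phase requested: nothing to run.
  return unless run_release_phase
  # A non-blank string is accepted as the script name.
  return release_script if release_script.is_a?(String) && !release_script.strip.empty?

  # Requested but not configured: fail before any binding or deploy happens.
  raise "release_script must be configured when --run-release-phase is provided."
end

puts release_script_to_run(run_release_phase: false, release_script: nil).inspect
puts release_script_to_run(run_release_phase: true, release_script: "release.sh")

begin
  release_script_to_run(run_release_phase: true, release_script: "   ")
rescue RuntimeError => e
  puts "error: #{e.message}"
end
```

Because `call` resolves the script, the image, the grants, and the workload data before it binds or deploys, a failure like the one above appears to surface before any workload is touched.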
@@ -42,6 +42,7 @@ module Command
  def template_variables
  {
  "__CPFLOW_GITHUB_ACTIONS_REF__" => cpflow_github_actions_ref,
+ "__CPFLOW_MINOR_SERIES__" => cpflow_minor_series,
  "__STAGING_BRANCH_FILTER__" => staging_branch_filter,
  "__STAGING_BRANCH_DEFAULT__" => staging_branch_default
  }
@@ -78,6 +79,11 @@ module Command
  def default_cpflow_github_actions_ref
  "v#{::Cpflow::VERSION}"
  end
+
+ # Returns e.g. "5.0.x" for the version-locking placeholder in cpflow-help.md.
+ def cpflow_minor_series
+ "#{::Cpflow::VERSION.split('.').first(2).join('.')}.x"
+ end
  end

  class GenerateGithubActions < Base
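
As an illustration of the `cpflow_minor_series` expression added above, here is a small sketch that applies the same string operations to a plain argument. The `minor_series` helper name is hypothetical; the gem reads `::Cpflow::VERSION` directly.

```ruby
# Hypothetical helper mirroring the expression in cpflow_minor_series.
def minor_series(version)
  # Keep the major and minor segments, then append ".x".
  "#{version.split('.').first(2).join('.')}.x"
end

puts minor_series("5.0.4") # prints 5.0.x
puts minor_series("5.1.1") # prints 5.1.x
```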
@@ -10,6 +10,7 @@ module Command
  DESCRIPTION = "Disables maintenance mode for an app"
  LONG_DESCRIPTION = <<~DESC
  - Disables maintenance mode for an app
+ - Safe to re-run: if a previous run timed out after switching the domain but before stopping the maintenance workload, re-running while maintenance mode is already disabled stops the maintenance workload to finish it (so it is not a pure no-op)
  - Specify the one-off workload through `one_off_workload` in the `.controlplane/controlplane.yml` file
  - Optionally specify the maintenance workload through `maintenance_workload` in the `.controlplane/controlplane.yml` file (defaults to 'maintenance')
  - Maintenance mode is only supported for domains that use path based routing mode and have a route configured for the prefix '/' on either port 80 or 443
@@ -10,6 +10,7 @@ module Command
  DESCRIPTION = "Enables maintenance mode for an app"
  LONG_DESCRIPTION = <<~DESC
  - Enables maintenance mode for an app
+ - Safe to re-run: if a previous run timed out after switching the domain but before stopping the app workloads, re-running while maintenance mode is already enabled stops the app workloads to finish it (so it is not a pure no-op)
  - Specify the one-off workload through `one_off_workload` in the `.controlplane/controlplane.yml` file
  - Optionally specify the maintenance workload through `maintenance_workload` in the `.controlplane/controlplane.yml` file (defaults to 'maintenance')
  - Maintenance mode is only supported for domains that use path based routing mode and have a route configured for the prefix '/' on either port 80 or 443
data/lib/command/run.rb CHANGED
@@ -47,6 +47,8 @@ module Command
  and also overridden per job through `--cpu` and `--memory`)
  - By default, the job is stopped if it takes longer than 6 hours to finish
  (can be configured though `runner_job_timeout` in `controlplane.yml`)
+ - Non-interactive jobs return the Control Plane cron job status even when the job finishes before
+ Control Plane exposes a runner replica to attach logs to
  DESC
  EXAMPLES = <<~EX.freeze
  ```sh
@@ -97,7 +99,7 @@ module Command

  attr_reader :interactive, :detached, :location, :original_workload, :runner_workload,
  :default_image, :default_cpu, :default_memory, :job_timeout, :job_history_limit,
- :container, :job, :replica, :command
+ :container, :job, :replica, :command, :job_completed_before_replica_exit_status

  def call # rubocop:disable Metrics/CyclomaticComplexity, Metrics/MethodLength, Metrics/PerceivedComplexity
  @interactive = config.options[:interactive] || interactive_command?
@@ -129,6 +131,7 @@ module Command
  update_runner_workload
  start_job
  wait_for_replica_for_job
+ exit(job_completed_before_replica_exit_status) if job_completed_before_replica_exit_status

  progress.puts
  if interactive
@@ -269,7 +272,20 @@ module Command
  result = cp.fetch_workload_replicas(runner_workload, location: location)
  @replica = result&.dig("items")&.find { |item| item.include?(job) }

- replica || false
+ replica || completed_job_before_replica? || false
+ end
+ end
+
+ def completed_job_before_replica?
+ case current_job_status
+ when "successful"
+ @job_completed_before_replica_exit_status = ExitCode::SUCCESS
+ true
+ when nil, "active", "pending"
+ false
+ else
+ @job_completed_before_replica_exit_status = ExitCode::ERROR_DEFAULT
+ true
  end
  end

@@ -505,9 +521,7 @@ module Command

  def resolve_job_status # rubocop:disable Metrics/MethodLength
  loop do
- result = cp.fetch_cron_workload(runner_workload, location: location)
- job_details = result&.dig("items")&.find { |item| item["id"] == job }
- status = job_details&.dig("status")
+ status = current_job_status

  Shell.debug("JOB STATUS", status)

@@ -522,6 +536,12 @@ module Command
  end
  end

+ def current_job_status
+ result = cp.fetch_cron_workload(runner_workload, location: location)
+ job_details = result&.dig("items")&.find { |item| item["id"] == job }
+ job_details&.dig("status")
+ end
+
  ###########################################
  ### temporary extaction from run:detached
  ###########################################
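
The status handling that `completed_job_before_replica?` adds in the run.rb hunks above can be summarized as a pure mapping from job status to an optional exit status. The sketch below is an assumption-laden illustration: `exit_status_for` is a hypothetical helper, and the two numeric constants are placeholders, since the actual values of `ExitCode::SUCCESS` and `ExitCode::ERROR_DEFAULT` do not appear in this diff.

```ruby
# Placeholder values; the gem's ExitCode constants are not shown in the diff.
SUCCESS = 0
ERROR_DEFAULT = 1

# Hypothetical helper: returns an exit status when the job already finished,
# or nil when the caller should keep waiting for a replica.
def exit_status_for(job_status)
  case job_status
  when "successful"
    SUCCESS
  when nil, "active", "pending"
    nil # job not finished (or not visible yet): keep polling for a replica
  else
    ERROR_DEFAULT # any other status is treated as a finished, failed job
  end
end

["successful", "active", "pending", nil, "failed"].each do |status|
  puts "#{status.inspect} -> #{exit_status_for(status).inspect}"
end
```

The design choice worth noting is the `else` branch: any status outside the known in-progress set ends the wait, so an unexpected status does not leave the command polling for a replica that will never appear.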
@@ -17,6 +17,7 @@ module Command
  - Configures app to have org-level secrets with default name `"{APP_PREFIX}-secrets"`
  using org-level policy with default name `"{APP_PREFIX}-secrets-policy"` (names can be customized, see docs)
  - Creates identity for secrets if it does not exist
+ - Binds the app identity to any configured `shared_secret_grants` policies as part of the secrets setup flow; skipped when `--skip-secrets-setup` or `--skip-secret-access-binding` is provided, or `skip_secrets_setup` is set
  - Use `--skip-secrets-setup` to prevent the automatic setup of secrets,
  or set it through `skip_secrets_setup` in the `.controlplane/controlplane.yml` file
  - Runs a post-creation hook after the app is created if `hooks.post_creation` is specified in the `.controlplane/controlplane.yml` file
@@ -35,9 +36,11 @@ module Command
  "or run 'cpflow apply-template #{templates.join(' ')} -a #{config.app}'."
  end

- skip_secrets_setup = config.options[:skip_secret_access_binding] ||
- config.options[:skip_secrets_setup] || config.current[:skip_secrets_setup]
+ skip_secrets_setup = skip_secrets_setup?

+ # Validate shared grants before app resource creation so config/policy
+ # drift does not leave a partially-created review app.
+ shared_secret_policy_grant_pairs = resolve_shared_secret_policy_grants unless skip_secrets_setup
  create_secret_and_policy_if_not_exist unless skip_secrets_setup

  args = []
@@ -45,11 +48,17 @@ module Command
  run_cpflow_command("apply-template", *templates, "-a", config.app, *args)

  bind_identity_to_policy unless skip_secrets_setup
+ bind_shared_secret_policy_grants(shared_secret_policy_grant_pairs) unless skip_secrets_setup
  run_post_creation_hook unless config.options[:skip_post_creation_hook]
  end

  private

+ def skip_secrets_setup?
+ config.options[:skip_secret_access_binding] ||
+ config.options[:skip_secrets_setup] || config.current[:skip_secrets_setup]
+ end
+
  def create_secret_and_policy_if_not_exist
  create_secret_if_not_exists
  create_policy_if_not_exists
data/lib/core/config.rb CHANGED
@@ -10,6 +10,9 @@ class Config # rubocop:disable Metrics/ClassLength
  include Helpers

  CONFIG_FILE_LOCATION = ".controlplane/controlplane.yml"
+ REQUIRED_SHARED_SECRET_GRANT_KEYS = %i[name secret_name policy_name].freeze
+ SHARED_SECRET_RESOURCE_NAME_KEYS = %i[secret_name policy_name].freeze
+ CONTROL_PLANE_RESOURCE_NAME_REGEX = /\A[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\z/

  def initialize(args, options, required_options)
  @args = args
@@ -56,6 +59,16 @@ class Config # rubocop:disable Metrics/ClassLength
  current&.dig(:secrets_policy_name) || "#{secrets}-policy"
  end

+ def shared_secret_grants
+ @shared_secret_grants ||= normalize_shared_secret_grants(current&.dig(:shared_secret_grants))
+ end
+
+ def shared_secret_placeholders
+ shared_secret_grants.to_h do |grant|
+ ["{{SHARED_SECRET_#{grant.fetch(:name).upcase}}}", grant.fetch(:secret_name)]
+ end
+ end
+
  def location
  @location ||= load_location_from_options || load_location_from_env || load_location_from_file
  end
@@ -171,6 +184,74 @@ class Config # rubocop:disable Metrics/ClassLength
  raise "Can't find config for app '#{app_name}' in 'controlplane.yml'." unless app_options
  end

+ def normalize_shared_secret_grants(grants)
+ return [] if grants.nil?
+
+ raise "shared_secret_grants for app config must be an array." unless grants.is_a?(Array)
+
+ normalized_grants = grants.map.with_index { |grant, index| normalize_shared_secret_grant(grant, index) }
+ ensure_unique_shared_secret_grant_names!(normalized_grants)
+ normalized_grants
+ end
+
+ def normalize_shared_secret_grant(raw_grant, index)
+ ensure_shared_secret_grant_map!(raw_grant, index)
+
+ grant = raw_grant.transform_keys(&:to_sym)
+ label = grant[:name] || "##{index + 1}"
+ ensure_shared_secret_grant_keys!(grant, label)
+ ensure_shared_secret_resource_names!(grant, label)
+ build_shared_secret_grant(grant)
+ end
+
+ def build_shared_secret_grant(grant)
+ name = grant.fetch(:name).to_s
+ ensure_shared_secret_grant_name!(name)
+ {
+ name: name,
+ secret_name: grant.fetch(:secret_name).to_s,
+ policy_name: grant.fetch(:policy_name).to_s
+ }
+ end
+
+ def ensure_shared_secret_grant_map!(raw_grant, index)
+ return if raw_grant.is_a?(Hash)
+
+ raise "shared_secret_grants entry ##{index + 1} must be a map."
+ end
+
+ def ensure_shared_secret_grant_keys!(grant, label)
+ REQUIRED_SHARED_SECRET_GRANT_KEYS.each do |key|
+ value = grant[key]
+ raise "shared_secret_grants entry '#{label}' must include #{key}." if value.nil? || value.to_s.empty?
+ end
+ end
+
+ def ensure_shared_secret_grant_name!(name)
+ return if name.match?(/\A[a-z](?:[a-z0-9_]*[a-z0-9])?\z/)
+
+ raise "shared_secret_grants entry name '#{name}' must be lower snake case."
+ end
+
+ def ensure_shared_secret_resource_names!(grant, label)
+ SHARED_SECRET_RESOURCE_NAME_KEYS.each do |key|
+ value = grant.fetch(key).to_s
+ next if value.match?(CONTROL_PLANE_RESOURCE_NAME_REGEX)
+
+ raise "shared_secret_grants entry '#{label}' #{key} '#{value}' must be a Control Plane resource name."
+ end
+ end
+
+ def ensure_unique_shared_secret_grant_names!(grants)
+ seen_names = {}
+ grants.each do |grant|
+ name = grant.fetch(:name)
+ raise "shared_secret_grants entry name '#{name}' must be unique." if seen_names[name]
+
+ seen_names[name] = true
+ end
+ end
+
  def ensure_app!
  return if app
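
To make the validation rules in the config.rb hunk above concrete, the sketch below copies the two regular expressions from the diff and checks a few sample names against them. The constant name `GRANT_NAME_REGEX`, the helper `placeholder_for`, and every sample value are hypothetical and chosen only for illustration.

```ruby
# Both patterns are copied from the diff; GRANT_NAME_REGEX is a hypothetical
# name for the inline pattern in ensure_shared_secret_grant_name!.
GRANT_NAME_REGEX = /\A[a-z](?:[a-z0-9_]*[a-z0-9])?\z/
CONTROL_PLANE_RESOURCE_NAME_REGEX = /\A[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\z/

# Hypothetical helper mirroring the key built in shared_secret_placeholders.
def placeholder_for(grant_name)
  "{{SHARED_SECRET_#{grant_name.upcase}}}"
end

# Grant names: lower snake case, starting with a letter, no trailing underscore.
%w[database redis_cache Database db_ 1db].each do |name|
  puts "grant name #{name.inspect}: #{name.match?(GRANT_NAME_REGEX)}"
end

# Resource names: lowercase letters, digits, and hyphens, no leading or
# trailing hyphen, and no underscores.
%w[my-app-review-database-secrets my_app_secrets -secrets].each do |name|
  puts "resource name #{name.inspect}: #{name.match?(CONTROL_PLANE_RESOURCE_NAME_REGEX)}"
end

puts placeholder_for("database") # prints {{SHARED_SECRET_DATABASE}}
```

The two patterns differ on purpose: the grant name feeds a template placeholder and allows underscores, while `secret_name` and `policy_name` must pass the hyphen-only resource-name pattern.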
 
@@ -396,13 +396,23 @@ class Controlplane # rubocop:disable Metrics/ClassLength
  end

  def bind_identity_to_policy(identity_link, policy)
- cmd = "cpln policy add-binding #{policy} --org #{org} --identity #{identity_link} --permission reveal"
- perform!(cmd)
+ cmd = [
+ "cpln", "policy", "add-binding", policy,
+ "--org", org,
+ "--identity", identity_link,
+ "--permission", "reveal"
+ ]
+ perform!(Shellwords.join(cmd))
  end

- def unbind_identity_from_policy(identity_link, policy)
- cmd = "cpln policy remove-binding #{policy} --org #{org} --identity #{identity_link} --permission reveal"
- perform!(cmd)
+ def unbind_identity_from_policy(identity_link, policy, permission: "reveal")
+ cmd = [
+ "cpln", "policy", "remove-binding", policy,
+ "--org", org,
+ "--identity", identity_link,
+ "--permission", permission
+ ]
+ perform!(Shellwords.join(cmd))
  end

  # apply
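
The switch from string interpolation to `Shellwords.join` in the hunk above changes how argument values reach the shell. The following sketch shows the difference with deliberately awkward, hypothetical values for the policy, org, and identity link; it only builds and prints the command strings and never executes them.

```ruby
require "shellwords"

# Hypothetical values; the policy name contains a space and a semicolon on
# purpose to show what escaping changes.
policy = "shared policy; echo oops"
org = "my-org"
identity_link = "/org/my-org/gvc/my-app/identity/my-app-identity"

# Old style: interpolation leaves the space and semicolon for the shell to interpret.
interpolated = "cpln policy add-binding #{policy} --org #{org} --identity #{identity_link} --permission reveal"

# New style: each argument is escaped separately, then joined with spaces.
args = ["cpln", "policy", "add-binding", policy,
        "--org", org, "--identity", identity_link, "--permission", "reveal"]
joined = Shellwords.join(args)

puts interpolated
puts joined

# Splitting the joined string with shell rules recovers the original arguments.
puts Shellwords.split(joined) == args
```

With interpolation, a shell would read the semicolon as a command separator; with the escaped form, the whole policy value stays a single argument.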
@@ -1,8 +1,19 @@
  # frozen_string_literal: true

- class MaintenanceMode
+ class MaintenanceMode # rubocop:disable Metrics/ClassLength
  extend Forwardable

+ DOMAIN_WORKLOAD_UPDATE_MAX_POLL_ATTEMPTS = 30
+ DOMAIN_WORKLOAD_UPDATE_RETRY_WAIT_SECONDS = 1
+ DOMAIN_WORKLOAD_UPDATE_STEP_OPTIONS = {
+ retry_on_failure: true,
+ # `with_retry` loops while `retry_count <= max_retry_count` starting from 0, so
+ # total attempts == max_retry_count + 1. Subtract 1 so the bounded poll runs
+ # exactly DOMAIN_WORKLOAD_UPDATE_MAX_POLL_ATTEMPTS times.
+ max_retry_count: DOMAIN_WORKLOAD_UPDATE_MAX_POLL_ATTEMPTS - 1,
+ wait: DOMAIN_WORKLOAD_UPDATE_RETRY_WAIT_SECONDS
+ }.freeze
+
  def_delegators :@command, :config, :progress, :cp, :step, :run_cpflow_command

  def initialize(command)
@@ -22,6 +33,7 @@ class MaintenanceMode
  def enable!
  if enabled?
  progress.puts("Maintenance mode is already enabled for app '#{config.app}'.")
+ ensure_app_workloads_stopped
  else
  enable_maintenance_mode
  end
@@ -30,6 +42,7 @@ class MaintenanceMode
  def disable!
  if disabled?
  progress.puts("Maintenance mode is already disabled for app '#{config.app}'.")
+ ensure_maintenance_workload_stopped
  else
  disable_maintenance_mode
  end
@@ -69,6 +82,28 @@ class MaintenanceMode
  cp.fetch_workload!(maintenance_workload)
  end

+ # A run that already switched the route but hit the poll timeout aborts before
+ # its final workload-stop step runs. The next `enable!`/`disable!` short-circuits
+ # on the route check, so do the matching stop here — once the route is on the
+ # target, this brings the workloads into the state that route implies. `ps:stop`
+ # is idempotent, so each is a no-op once the target workload is already stopped.
+ #
+ # The stop target differs by direction. `ps:stop -a` covers only
+ # `app_workloads` + `additional_workloads`, never the maintenance workload:
+ # - enable!: the route now points at the maintenance workload, so the *app*
+ # workloads are the ones left running and `ps:stop -a` is correct.
+ # - disable!: the route now points at the app workloads (and a short-circuit
+ # `disable!` can run on an app whose app workloads are serving live traffic),
+ # so stopping all workloads would cause an outage. The workload a timed-out
+ # `disable!` leaves running is the maintenance workload, so stop only that.
+ def ensure_app_workloads_stopped
+ start_or_stop_all_workloads(:stop)
+ end
+
+ def ensure_maintenance_workload_stopped
+ start_or_stop_maintenance_workload(:stop)
+ end
+
  def start_or_stop_all_workloads(action)
  run_cpflow_command("ps:#{action}", "-a", config.app, "--wait")

@@ -82,16 +117,68 @@ class MaintenanceMode
  end

  def switch_domain_workload(to:)
- step("Switching workload for domain '#{domain_data['name']}' to '#{to}'") do
- cp.set_domain_workload(domain_data, to)
-
- # Give it a bit of time for the domain to update
- Kernel.sleep(30)
+ domain_name = domain_data["name"]
+
+ # Unlike the polling step below, the switch request is intentionally not
+ # retried: if it fails, nothing has changed yet, so aborting and letting the
+ # user re-run is the safe outcome. (Retrying would not help here anyway —
+ # `with_retry` retries on a falsy return, and `set_domain_workload` raises
+ # rather than returning false.)
+ step("Requesting workload switch for domain '#{domain_name}' to '#{to}'") do
+ # `set_domain_workload` mutates the route in place, so send a deep copy
+ # (round-tripped through JSON, since the domain is plain parsed-API data
+ # with string keys and JSON-native values) to keep the cached
+ # `@domain_data` reflecting the real server route. The poll re-fetches and
+ # matches on that fresh data, but if every poll times out without a routable
+ # fetch, `@domain_data` is what a re-run's `enabled?`/`disabled?` check reads
+ # — mutating it here would make that check report the requested route, not
+ # the actual one.
+ domain_data_for_update = JSON.parse(JSON.generate(domain_data))
+ cp.set_domain_workload(domain_data_for_update, to)
  end

+ wait_for_domain_workload_switch(domain_name, to)
+
  progress.puts
  end

+ # If the route never switches within the bounded poll window, this step aborts
+ # (abort_on_error) before any workloads are stopped, so traffic stays on the
+ # current workload. The label tells the user how to recover, since an exhausted
+ # poll has no error message of its own to print.
+ def wait_for_domain_workload_switch(domain_name, to)
+ @last_poll_error = nil # reset the poll-error dedup state for this poll window
+ step("Waiting for domain '#{domain_name}' workload to switch to '#{to}' " \
+ "(re-run this command if it times out)", **DOMAIN_WORKLOAD_UPDATE_STEP_OPTIONS) do
+ domain_workload_update_confirmed?(domain_name, to)
+ end
+ end
+
+ # Refetches the domain, refreshes the cached `@domain_data` when the fetch
+ # returns a routable domain, and reports whether the route now points at
+ # `workload`. Any error — a 5xx mid-propagation, a transient 403
+ # (`ForbiddenError < StandardError`, not a `RuntimeError`), or a network blip —
+ # is treated as "not switched yet" so the poll keeps retrying. The broad rescue
+ # logs the error to the step's stderr, so a latent bug (e.g. `NoMethodError`)
+ # surfaces in the "failed!" output on timeout instead of being swallowed.
+ def domain_workload_update_confirmed?(domain_name, workload)
+ refreshed_domain_data = cp.fetch_domain(domain_name)
+ @domain_data = refreshed_domain_data if refreshed_domain_data
+ refreshed_domain_data && cp.domain_workload_matches?(refreshed_domain_data, workload)
+ rescue StandardError => e
+ # A persistent failure (bad domain name, network outage, a latent bug) repeats
+ # the same error on every poll attempt, so only log when the message changes —
+ # otherwise the timeout output would carry up to MAX_POLL_ATTEMPTS identical
+ # lines. Guard on `tmp_stderr` so this stays safe if ever called outside a
+ # `step` block, where no tmp stderr is set up.
+ message = "#{e.class}: #{e.message} (#{e.backtrace&.first})\n"
+ if message != @last_poll_error && Shell.tmp_stderr
+ Shell.write_to_tmp_stderr(message)
+ @last_poll_error = message
+ end
+ false
+ end
+
  def domain_data
  @domain_data ||=
  if config.domain
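
Two ideas from the maintenance-mode hunks above lend themselves to a small stand-alone illustration: the attempt arithmetic described in the `DOMAIN_WORKLOAD_UPDATE_STEP_OPTIONS` comment, and the JSON round-trip deep copy in `switch_domain_workload`. This is a sketch under stated assumptions: `poll_until` is a hypothetical stand-in written from that comment (it is not the gem's `with_retry`), and the domain hash shape is invented for the demo.

```ruby
require "json"

MAX_POLL_ATTEMPTS = 30

# Hypothetical retry loop following the comment in the diff: it runs while
# retry_count <= max_retry_count starting from 0, so it makes
# max_retry_count + 1 attempts in total.
def poll_until(max_retry_count:, wait: 0)
  retry_count = 0
  while retry_count <= max_retry_count
    return true if yield

    retry_count += 1
    sleep(wait) if wait.positive? && retry_count <= max_retry_count
  end
  false
end

attempts = 0
confirmed = poll_until(max_retry_count: MAX_POLL_ATTEMPTS - 1) do
  attempts += 1
  false # in this demo the route never switches
end
puts "confirmed=#{confirmed} attempts=#{attempts}" # prints confirmed=false attempts=30

# Deep copy via JSON: mutating the copy leaves the cached data untouched.
# The hash shape below is an assumption, not the real domain payload.
cached = { "spec" => { "ports" => [{ "routes" => [{ "prefix" => "/", "workloadLink" => "app" }] }] } }
copy = JSON.parse(JSON.generate(cached))
copy["spec"]["ports"][0]["routes"][0]["workloadLink"] = "maintenance"
puts cached.dig("spec", "ports", 0, "routes", 0, "workloadLink") # prints app
```

A shallow `dup` would not be enough here, because the nested arrays and hashes would still be shared between the copy and the cached value.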
@@ -49,6 +49,10 @@ class TemplateParser
  .gsub("{{APP_SECRETS}}", config.secrets)
  .gsub("{{APP_SECRETS_POLICY}}", config.secrets_policy)

+ config.shared_secret_placeholders.each do |placeholder, secret_name|
+ yaml_file = yaml_file.gsub(placeholder, secret_name)
+ end
+
  find_deprecated_variables(yaml_file)

  # Kept for backwards compatibility
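
The placeholder loop added to the template parser above can be tried in isolation. The sketch below assumes a single grant named `database` and a hypothetical template line; the `cpln://secret/...` reference and the `.DATABASE_URL` key are illustrative and do not come from a real template in this diff.

```ruby
# Hypothetical placeholder map, shaped like the result of
# shared_secret_placeholders for one grant named "database".
placeholders = {
  "{{SHARED_SECRET_DATABASE}}" => "my-app-review-database-secrets"
}

# Hypothetical template fragment that references the shared secret.
yaml_file = <<~YAML
  env:
    - name: DATABASE_URL
      value: cpln://secret/{{SHARED_SECRET_DATABASE}}.DATABASE_URL
YAML

# Same loop as in the diff: a String pattern makes gsub match literally,
# so the braces are not treated as regular-expression syntax.
placeholders.each do |placeholder, secret_name|
  yaml_file = yaml_file.gsub(placeholder, secret_name)
end

puts yaml_file
```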
@@ -1,6 +1,6 @@
  # frozen_string_literal: true

  module Cpflow
- VERSION = "5.0.4"
+ VERSION = "5.1.1"
  MIN_CPLN_VERSION = "3.1.0"
  end
@@ -41,6 +41,13 @@ apps:
  __APP_PREFIX__-review:
  <<: *common
  match_if_app_name_starts_with: true
+ # To save review-app database cost, create one shared staging database
+ # secret and policy, then uncomment this block and use
+ # {{SHARED_SECRET_DATABASE}} in templates that need DATABASE_URL.
+ # shared_secret_grants:
+ # - name: database
+ # secret_name: __APP_PREFIX__-review-database-secrets
+ # policy_name: __APP_PREFIX__-review-database-secrets-policy
  # Uncomment to automatically initialize and tear down review-app databases:
  # hooks:
  # post_creation: bundle exec rails db:prepare
@@ -41,6 +41,13 @@ apps:
  __APP_PREFIX__-review:
  <<: *common
  match_if_app_name_starts_with: true
+ # Optional: if a review app needs an existing shared org-level secret,
+ # declare its policy here and reference it in templates with
+ # {{SHARED_SECRET_<NAME>}}.
+ # shared_secret_grants:
+ # - name: database
+ # secret_name: __APP_PREFIX__-review-database-secrets
+ # policy_name: __APP_PREFIX__-review-database-secrets-policy

  __APP_PREFIX__-production:
  <<: *production
@@ -23,11 +23,23 @@ For the normal generated review-app path, GitHub needs one repository secret:
  | --- | --- | --- |
  | `CPLN_TOKEN_STAGING` | Repository secret | Control Plane service-account token for the staging/review org. |

+ For public repositories, use a staging/review token that cannot access
+ production Control Plane resources. Generated review-app deploys skip fork PR
+ heads because Docker builds use repository secrets. If a forked change needs a
+ review app, first move the reviewed change to a trusted branch in this
+ repository.
+
  No repository variables are required for the standard review-app path when
  `.controlplane/controlplane.yml` has exactly one review app entry with
  `match_if_app_name_starts_with: true`. cpflow infers the review-app prefix and
  staging org from that config.

+ Review apps run pull request code. Any value mounted through
+ `cpln://secret/...` can be read by that code after the workload starts, so keep
+ review-app secret dictionaries limited to disposable databases, review-only
+ renderer credentials, and license values that are acceptable for review-app
+ exposure.
+
  Optional overrides exist for forks, clones, and unusual apps:

  | Name | Notes |
@@ -66,27 +78,50 @@ Production promotion is part of the generated flow, but keep it protected:
  | `PRODUCTION_APP_NAME` | Prefer `production` Environment variable | Production app name from `controlplane.yml`. |

  Configure the `production` GitHub Environment with required reviewers and
- prevent self-review. The generated promotion wrapper passes only the staging
- token from repository secrets; GitHub injects `CPLN_TOKEN_PRODUCTION` only after
- the environment approval gate passes.
+ prevent self-review. Production promotion intentionally runs as a normal
+ caller-repo workflow job with `environment: production`, then checks out the
+ pinned `control-plane-flow` release for shared actions. Do not move production
+ promotion behind a cross-repo reusable workflow: GitHub does not expose this
+ repo's environment secrets to that called workflow.
+
+ Keep `CPLN_TOKEN_PRODUCTION` absent from repository and organization secrets. A
+ normal environment-gated job cannot tell which secret scope supplied a nonempty
+ value, so a broader secret with the same name can mask a missing environment
+ secret.
+
+ If the promotion workflow fails with
+ `CPLN_TOKEN_PRODUCTION is not set. Add it as a secret on the 'production' GitHub Environment.`,
+ the token is missing from the environment scope or the workflow job is no longer
+ declaring `environment: production`. Create or verify the environment secret
+ and confirm there is no same-named repository or organization secret:
+ You need permission to manage repository environments and secrets to run these
+ commands.
+
+ ```sh
+ gh secret set CPLN_TOKEN_PRODUCTION --repo OWNER/REPO --env production
+ # Paste the token value when prompted.
+ gh secret list --repo OWNER/REPO --env production
+ gh secret list --repo OWNER/REPO
+ gh secret list --org OWNER | grep '^CPLN_TOKEN_PRODUCTION[[:space:]]' || true
+ ```

  Before the first promotion, bootstrap the production app the same way in the
  production org, using production-only secrets and values.

  ## Version Locking

- Generated wrappers pin Control Plane Flow once with the reusable workflow
- `uses:` ref, for example `@__CPFLOW_GITHUB_ACTIONS_REF__`. For stable releases,
- this ref should be a release tag. The upstream reusable workflow automatically
- loads its matching shared actions from GitHub's workflow context, so downstream
- wrappers should not pass a duplicate Control Plane Flow ref input. If your
- generated wrappers still include a `with:` block whose only purpose is to repeat
- the same ref, regenerate them with a newer `cpflow`.
+ Generated wrappers pin Control Plane Flow with a release tag, for example
+ `__CPFLOW_GITHUB_ACTIONS_REF__`. Reusable review-app, staging, cleanup, and
+ helper workflows pin the tag in their `uses:` ref. Production promotion pins
+ the same tag in the `Checkout control-plane-flow actions` step so the
+ caller-owned job can keep `environment: production` and receive production
+ environment secrets directly.

  Leave `CPFLOW_VERSION` unset so the workflow builds cpflow from the same
  checked-out upstream source. If you set `CPFLOW_VERSION`, it must match the
- release tag, for example `CPFLOW_VERSION=5.0.1` with a wrapper pinned to
- `uses: ...@v5.0.1`.
+ release tag your wrappers are pinned to: a `CPFLOW_VERSION=__CPFLOW_MINOR_SERIES__` runtime
+ override goes with a wrapper pinned to `uses: ...@v__CPFLOW_MINOR_SERIES__` (substitute the
+ release you pinned above).

  After updating the `cpflow` gem in this repo, update the generated wrappers in
  the same PR:
@@ -119,7 +154,7 @@ Most apps do not need these:
  | Name | Notes |
  | --- | --- |
  | `DOCKER_BUILD_EXTRA_ARGS` | Newline-delimited extra Docker build tokens. |
- | `DOCKER_BUILD_SSH_KEY` | Private SSH key for Docker builds that fetch private dependencies. |
+ | `DOCKER_BUILD_SSH_KEY` | Read-only, revocable deploy key for Docker builds that fetch private dependencies. Do not use a personal SSH key. |
  | `DOCKER_BUILD_SSH_KNOWN_HOSTS` | SSH known_hosts entries when SSH build hosts are not GitHub.com. |
  | `REVIEW_APP_DEPLOYING_ICON_URL` | Cosmetic custom image URL for the animated deploying icon. Set to `none` to use the text fallback icon. |
  | `STAGING_APP_BRANCH` | Custom staging branch. The branch must also appear in `cpflow-deploy-staging.yml`'s push filter. |