cpflow 3.0.0
- checksums.yaml +7 -0
- data/.github/workflows/check_cpln_links.yml +19 -0
- data/.github/workflows/command_docs.yml +24 -0
- data/.github/workflows/rspec-shared.yml +56 -0
- data/.github/workflows/rspec.yml +28 -0
- data/.github/workflows/rubocop.yml +24 -0
- data/.gitignore +18 -0
- data/.overcommit.yml +16 -0
- data/.rubocop.yml +22 -0
- data/.simplecov_spawn.rb +10 -0
- data/CHANGELOG.md +259 -0
- data/CONTRIBUTING.md +73 -0
- data/Gemfile +7 -0
- data/Gemfile.lock +126 -0
- data/LICENSE +21 -0
- data/README.md +546 -0
- data/Rakefile +21 -0
- data/bin/cpflow +6 -0
- data/cpflow +6 -0
- data/cpflow.gemspec +41 -0
- data/docs/assets/grafana-alert.png +0 -0
- data/docs/assets/memcached.png +0 -0
- data/docs/assets/sidekiq-pre-stop-hook.png +0 -0
- data/docs/commands.md +454 -0
- data/docs/dns.md +15 -0
- data/docs/migrating.md +262 -0
- data/docs/postgres.md +436 -0
- data/docs/redis.md +128 -0
- data/docs/secrets-and-env-values.md +42 -0
- data/docs/tips.md +150 -0
- data/docs/troubleshooting.md +6 -0
- data/examples/circleci.yml +104 -0
- data/examples/controlplane.yml +159 -0
- data/lib/command/apply_template.rb +209 -0
- data/lib/command/base.rb +540 -0
- data/lib/command/build_image.rb +49 -0
- data/lib/command/cleanup_images.rb +136 -0
- data/lib/command/cleanup_stale_apps.rb +79 -0
- data/lib/command/config.rb +48 -0
- data/lib/command/copy_image_from_upstream.rb +108 -0
- data/lib/command/delete.rb +149 -0
- data/lib/command/deploy_image.rb +56 -0
- data/lib/command/doctor.rb +47 -0
- data/lib/command/env.rb +22 -0
- data/lib/command/exists.rb +23 -0
- data/lib/command/generate.rb +45 -0
- data/lib/command/info.rb +222 -0
- data/lib/command/latest_image.rb +19 -0
- data/lib/command/logs.rb +49 -0
- data/lib/command/maintenance.rb +42 -0
- data/lib/command/maintenance_off.rb +62 -0
- data/lib/command/maintenance_on.rb +62 -0
- data/lib/command/maintenance_set_page.rb +34 -0
- data/lib/command/no_command.rb +23 -0
- data/lib/command/open.rb +33 -0
- data/lib/command/open_console.rb +26 -0
- data/lib/command/promote_app_from_upstream.rb +38 -0
- data/lib/command/ps.rb +41 -0
- data/lib/command/ps_restart.rb +37 -0
- data/lib/command/ps_start.rb +51 -0
- data/lib/command/ps_stop.rb +82 -0
- data/lib/command/ps_wait.rb +40 -0
- data/lib/command/run.rb +573 -0
- data/lib/command/setup_app.rb +113 -0
- data/lib/command/test.rb +23 -0
- data/lib/command/version.rb +18 -0
- data/lib/constants/exit_code.rb +7 -0
- data/lib/core/config.rb +316 -0
- data/lib/core/controlplane.rb +552 -0
- data/lib/core/controlplane_api.rb +170 -0
- data/lib/core/controlplane_api_direct.rb +112 -0
- data/lib/core/doctor_service.rb +104 -0
- data/lib/core/helpers.rb +26 -0
- data/lib/core/shell.rb +100 -0
- data/lib/core/template_parser.rb +76 -0
- data/lib/cpflow/version.rb +6 -0
- data/lib/cpflow.rb +288 -0
- data/lib/deprecated_commands.json +9 -0
- data/lib/generator_templates/Dockerfile +27 -0
- data/lib/generator_templates/controlplane.yml +62 -0
- data/lib/generator_templates/entrypoint.sh +8 -0
- data/lib/generator_templates/templates/app.yml +21 -0
- data/lib/generator_templates/templates/postgres.yml +176 -0
- data/lib/generator_templates/templates/rails.yml +36 -0
- data/rakelib/create_release.rake +81 -0
- data/script/add_command +37 -0
- data/script/check_command_docs +3 -0
- data/script/check_cpln_links +45 -0
- data/script/rename_command +43 -0
- data/script/update_command_docs +62 -0
- data/templates/app.yml +13 -0
- data/templates/daily-task.yml +32 -0
- data/templates/maintenance.yml +25 -0
- data/templates/memcached.yml +24 -0
- data/templates/postgres.yml +32 -0
- data/templates/rails.yml +27 -0
- data/templates/redis.yml +21 -0
- data/templates/redis2.yml +37 -0
- data/templates/sidekiq.yml +38 -0
- metadata +341 -0
data/docs/migrating.md
ADDED
@@ -0,0 +1,262 @@
# Steps to Migrate from Heroku to Control Plane

We recommend following along with
[this example project](https://github.com/shakacode/react-webpack-rails-tutorial).

1. [Clone the Staging Environment](#clone-the-staging-environment)
   - [Review Special Gems](#review-special-gems)
   - [Create a Minimum Bootable Config](#create-a-minimum-bootable-config)
2. [Create the Review App Process](#create-the-review-app-process)
   - [Database for Review Apps](#database-for-review-apps)
   - [Redis and Memcached for Review Apps](#redis-and-memcached-for-review-apps)
3. [Deploy to Production](#deploy-to-production)

## Clone the Staging Environment

By cloning the staging environment on Heroku, you can speed up the initial provisioning of the app on Control Plane
without compromising your current environment.

Consider migrating just the web dyno first, and get other types of dynos working afterward. You can also move the
add-ons to Control Plane later once the app works as expected.

First, create a new Heroku app with all the add-ons, copying the data from the current staging app.

Then, copy project-specific configs to a `.controlplane/` directory at the top of your project. `cpflow` picks those up
depending on which project folder tree it runs in, so several projects with different configs can be managed
without explicitly switching configs.

Edit the `.controlplane/controlplane.yml` file as needed. Note that the `my-app-staging` name used in the examples below
is defined in this file. See
[this example](https://github.com/shakacode/react-webpack-rails-tutorial/blob/master/.controlplane/controlplane.yml).

Before the initial setup, add the templates for the app to the `.controlplane/controlplane.yml` file, using the
`setup_app_templates` key, e.g.:

```yaml
my-app-staging:
  <<: *common
  setup_app_templates:
    - app
    - redis
    - memcached
    - rails
    - sidekiq
```

Note how the templates correspond to files in the `.controlplane/templates/` directory. These files will be used by the
`cpflow setup-app` and `cpflow apply-template` commands.

Ensure that env vars point to the Heroku add-ons in the template for the app (`.controlplane/templates/app.yml`). See
[this example](https://github.com/shakacode/react-webpack-rails-tutorial/blob/master/.controlplane/templates/gvc.yml).

After that, create a Dockerfile in `.controlplane/Dockerfile` for your deployment. See
[this example](https://github.com/shakacode/react-webpack-rails-tutorial/blob/master/.controlplane/Dockerfile).

You should have a folder structure similar to the following:

```sh
app_main_folder/
  .controlplane/
    Dockerfile          # Your app's Dockerfile, with some Control Plane changes.
    controlplane.yml
    entrypoint.sh       # App-specific - edit as needed.
    templates/
      app.yml
      memcached.yml
      rails.yml
      redis.yml
      sidekiq.yml
```

The example
[`.controlplane/` directory](https://github.com/shakacode/react-webpack-rails-tutorial/tree/master/.controlplane)
already contains these files.

Finally, check the app for any Heroku-specific code and update it, such as the `HEROKU_SLUG_COMMIT` env var and other
env vars beginning with `HEROKU_`. You should add some logic to check for the Control Plane equivalents - it might be
worth adding a `CONTROLPLANE` env var to act as a feature flag and help run different code for Heroku and Control Plane
until the migration is complete.
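Such a feature flag can be sketched as a small helper (a hypothetical example - the `PlatformInfo` name and the `COMMIT_SHA` env var are assumptions; only `CONTROLPLANE` and `HEROKU_SLUG_COMMIT` come from the text above):

```ruby
# Hypothetical helper illustrating the CONTROLPLANE feature-flag idea:
# branch on the platform and read the platform-specific env vars.
module PlatformInfo
  def self.controlplane?
    ENV.key?("CONTROLPLANE")
  end

  # Returns the deployed revision regardless of platform.
  def self.revision
    if controlplane?
      # COMMIT_SHA is an assumed var you would set yourself in the Control Plane template.
      ENV.fetch("COMMIT_SHA", "unknown")
    else
      ENV.fetch("HEROKU_SLUG_COMMIT", "unknown")
    end
  end
end
```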
You might want to [review special gems](#review-special-gems) and
[create a minimum bootable config](#create-a-minimum-bootable-config).

At first, do the deployments from the command line. Then set up CI scripts to trigger the deployment upon merges to
master/main.

Use these commands for the initial setup and deployment:

```sh
# Provision infrastructure (one-time-only for new apps) using templates.
cpflow setup-app -a my-app-staging

# Build and push image with auto-tagging, e.g., "my-app-staging:1_456".
cpflow build-image -a my-app-staging --commit 456

# Prepare database.
cpflow run -a my-app-staging --image latest -- rails db:prepare

# Deploy latest image.
cpflow deploy-image -a my-app-staging

# Open app in browser.
cpflow open -a my-app-staging
```

Then for promoting code upgrades:

```sh
# Build and push new image with sequential tagging, e.g., "my-app-staging:2".
cpflow build-image -a my-app-staging

# Or build and push new image with sequential tagging and commit SHA, e.g., "my-app-staging:2_ABC".
cpflow build-image -a my-app-staging --commit ABC

# Run database migrations (or other release tasks) with the latest image, while the app is still running on the
# previous image. This is analogous to the release phase.
cpflow run -a my-app-staging --image latest -- rails db:migrate

# Deploy latest image.
cpflow deploy-image -a my-app-staging
```

### Review Special Gems

Make sure to review "special" gems which might be related to Heroku, e.g.:

- `rails_autoscale_agent`. It's specific to Heroku, so it must be removed.
- `puma_worker_killer`. In general, it's unnecessary on Control Plane, as Kubernetes containers will restart based on
  their own logic and may not restart at all if everything is ok.
- `rack-timeout`. It could possibly be replaced with Control Plane's `timeout` option.

You can use the `CONTROLPLANE` env var to separate the gems, e.g.:

```ruby
# Gemfile
group :staging, :production do
  gem "rack-timeout"

  unless ENV.key?("CONTROLPLANE")
    gem "rails_autoscale_agent"
    gem "puma_worker_killer"
  end
end
```
### Create a Minimum Bootable Config

You can try to create a minimum bootable config to migrate parts of your app gradually. To do that, follow these steps:

1. Rename the existing `application.yml` file to some other name (e.g., `application.old.yml`)
2. Create a new **minimal** `application.yml` file, e.g.:

   ```yaml
   SECRET_KEY_BASE: "123"
   # This should be enabled for `rails s`, not `rails assets:precompile`.
   # DATABASE_URL: postgres://localhost:5432/dbname
   # RAILS_SERVE_STATIC_FILES: "true"

   # You will add whatever env vars are required here later.
   ```

3. Try running `RAILS_ENV=production CONTROLPLANE=true rails assets:precompile`
   (theoretically, this should work without any additional env vars)
4. Fix whatever code needs to be fixed and add missing env vars
   (the fewer env vars are needed, the cleaner the `Dockerfile` will be)
5. Enable the `DATABASE_URL` and `RAILS_SERVE_STATIC_FILES` env vars
6. Try running `RAILS_ENV=production CONTROLPLANE=true rails s`
7. Fix whatever code needs to be fixed and add required env vars to `application.yml`
8. Try running your **production** entrypoint command, e.g.,
   `RAILS_ENV=production RACK_ENV=production CONTROLPLANE=true puma -C config/puma.rb`
9. Fix whatever code needs to be fixed and add required env vars to `application.yml`

Now you should have a minimal bootable config.

Then you can temporarily set the `LOG_LEVEL=debug` env var and disable unnecessary services to help with the process,
e.g.:

```yaml
DISABLE_SPRING: "true"
SCOUT_MONITOR: "false"
RACK_TIMEOUT_SERVICE_TIMEOUT: "0"
```
## Create the Review App Process

Add an entry for review apps to the `.controlplane/controlplane.yml` file. By adding a `match_if_app_name_starts_with`
key with the value `true`, any app that starts with the entry's name will use this config. Doing this allows you to
configure an entry for, e.g., `my-app-review`, and then create review apps starting with that name (e.g.,
`my-app-review-1234`, `my-app-review-5678`, etc.). Here's an example:

```yaml
my-app-review:
  <<: *common
  match_if_app_name_starts_with: true
  setup_app_templates:
    - app
    - redis
    - memcached
    - rails
    - sidekiq
```

In your CI scripts, you can create a review app using some identifier (e.g., the number of the PR on GitHub).

```sh
# On CircleCI, you can use `echo $CIRCLE_PULL_REQUEST | grep -Eo '[0-9]+$'` to extract the number of the PR.
PR_NUM=$(... extract the number of the PR here ...)
echo "export APP_NAME=my-app-review-$PR_NUM" >> $BASH_ENV

# Only create the app if it doesn't exist yet, as we may have multiple triggers for the review app
# (such as when a PR gets updated).
if ! cpflow exists -a ${APP_NAME}; then
  cpflow setup-app -a ${APP_NAME}
  echo "export NEW_APP=true" >> $BASH_ENV
fi

# The `NEW_APP` env var that we exported above can be used to either reset or migrate the database before deploying.
if [ -n "${NEW_APP}" ]; then
  cpflow run -a ${APP_NAME} --image latest -- rails db:reset
else
  cpflow run -a ${APP_NAME} --image latest -- rails db:migrate
fi
```

Then follow the same steps as for the initial deployment or code upgrades.

### Database for Review Apps

Review app resources should be handled as env vars in the template for the app
(`.controlplane/templates/app.yml`), e.g.:

```yaml
- name: DATABASE_URL
  value: postgres://postgres:XXXXXXXX@cpln-XXXX-staging.XXXXXX.us-east-1.rds.amazonaws.com:5432/APP_GVC
```

Notice that `APP_GVC` is the app name (e.g., `my-app-review-1234`), which is used as the database name on RDS, so that
each review app gets its own database on the one RDS instance used for all review apps.
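As an illustration of that substitution (a hypothetical sketch, not cpflow's actual template engine), the `APP_GVC` placeholder resolves to the app name, so each review app's URL ends up pointing at its own database:

```ruby
# Hypothetical sketch of resolving an APP_GVC placeholder to a per-app
# database URL; cpflow's real template processing may differ.
def resolve_template(value, app_name)
  value.gsub("APP_GVC", app_name)
end

template = "postgres://postgres:secret@shared-rds.example.com:5432/APP_GVC"
resolve_template(template, "my-app-review-1234")
# => "postgres://postgres:secret@shared-rds.example.com:5432/my-app-review-1234"
```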
### Redis and Memcached for Review Apps

As long as no persistence is needed for Redis and Memcached, the workload templates in the `templates/` directory of
this repository should be sufficient for review apps. Using these templates results in considerable cost savings
compared to paying for the resources on Heroku.

```yaml
- name: MEMCACHE_SERVERS
  value: memcached.APP_GVC.cpln.local
- name: REDIS_URL
  value: redis://redis.APP_GVC.cpln.local:6379
```

## Deploy to Production

Only try deploying to production once staging and review apps are working well.

For simplicity, keep add-ons running on Heroku initially. You could move the database over to RDS first. However, it's a
bit simpler to isolate any differences in cost and performance by first moving your compute over to Control Plane.

Ensure that your Control Plane compute is in the AWS region `us-east-1`; otherwise, you'll have noticeable extra latency
with your calls to resources. You might also have egress charges from Control Plane.

Use the `cpflow promote-app-from-upstream` command to promote the staging app to production.
data/docs/postgres.md
ADDED
@@ -0,0 +1,436 @@
# Migrating Postgres database from Heroku infrastructure

One of the biggest problems when moving away from Heroku infrastructure is migrating the database. While migration is
rather easy between Heroku-hosted databases or between non-Heroku-hosted databases (as Postgres has native tools for
that), it is not easily possible between Heroku and anything outside Heroku,
**as Heroku doesn't allow setting up WAL replication for Postgres**. Period - no replication of any kind outside of
Heroku infrastructure.

Previously, it was reportedly possible to ask Heroku support to manually set up WAL log shipping, but they no longer
do that, which leaves only two options:

### Option A: dump and restore

Nothing problematic here in general, **if you can withstand long application maintenance time**.
You basically need to:

1. enable maintenance
2. stop the application completely and wait for all the database writes to finish
3. dump the database on Heroku
4. restore the database on RDS
5. start the application
6. disable maintenance
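The steps above can be sketched with the Heroku CLI and the standard Postgres tools (a hedged outline - the app name, RDS host, and database name are placeholders, and your exact flags may differ):

```sh
# Placeholders: my-app (Heroku app), my_db (RDS database).
heroku maintenance:on -a my-app

# Capture a fresh backup on Heroku and download it (e.g., onto the EC2 instance).
heroku pg:backups:capture -a my-app
curl -o latest.dump "$(heroku pg:backups:url -a my-app)"

# Restore on RDS; -j 4 restores in 4 parallel threads.
pg_restore --verbose --clean --no-acl --no-owner -j 4 \
  -h xxx.us-east-2.rds.amazonaws.com -U postgres -d my_db latest.dump

# Point the app at RDS, then:
heroku maintenance:off -a my-app
```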
And if the database is small or it is a hobby app, you need not look any further.
However, this is not acceptable for most production apps, as their databases are huge and maintenance time
should be as short as possible.

Rough timing for a 1 GB database can be (but your mileage may vary):

- 2.5h creating the Heroku backup
- 0.5h downloading the backup to EC2
- 13h restoring the backup on RDS (in 4 threads)

**~16h total time, all of it maintenance downtime**

### Option B: logical replication

Several logical replication solutions exist for Postgres - Slony, Bucardo, Londiste, Mimeo - but when you dig deeper,
the only viable and up-to-date solution for migrating from Heroku to RDS is Bucardo.

The migration process with Bucardo looks as follows:

1. set up Bucardo on a dedicated EC2 instance
2. dump the Heroku database schema and restore it on RDS - rather fast, as there is no data
3. start Bucardo replication - this will install triggers and a special schema in your database
4. wait for replication to catch up - this may take a long time, but the application can continue working as usual
5. enable maintenance
6. stop the application completely and wait for replication to finally finish
7. switch the database connection strings
8. start the application
9. disable maintenance

Maintenance downtime here can be minutes, not hours or days as in Option A, but there are no free lunches - the
process is more complex.

Rough timing for a 1 GB database can be (but your mileage may vary):

- whatever setup time, no hurry
- 1.5 days for onetimecopy (in 1 thread) - DDL changes not allowed, but no downtime
- 1-2 min for the database switch (maintenance downtime)

**~2 days total time, ~1-2 min maintenance downtime**

### Some considerations

- DDL changes should be "frozen and postponed" while Bucardo replication is running. There is also a way to stop
  replication, update the DDL in both databases, and restart replication; however, a no-DDL freeze for a day or two
  seems a reasonable restriction for production databases versus the potential for errors.

- There is a "speed up" option to restore the dump (with threads) and then run Bucardo to catch up only on the deltas,
  but it looks unnecessary, as the speed gain is minimal versus the potential for errors. It would not speed things up
  dramatically - it would just save a couple of hours of non-maintenance time (which would most probably be spent on
  the command line anyway) - so it is not worth doing.

## Before replication

### Application code changes

Before everything else, we need to recheck the database schema and ensure **that every table has a primary key (PK)
in place**, as Bucardo uses PKs for replication.

> NOTE: theoretically, Bucardo can work with unique indexes as well, but having a PK on each table is easy and avoids
> unnecessary complications

So, please stop, and do whatever is needed for your application.

### Choosing database location and EC2 location

All Heroku Postgres databases for the US are running in AWS `us-east-1`. Control Plane, on the other hand,
recommends `us-east-2` as a default. So we need to choose:

- either a simple setup - main database in `us-east-2`
- or a bit more complex one - main database in `us-east-2`, replica in `us-east-1` (which can be removed later)

The latter makes sense if your application supports working with replicas: read-only `SELECT` queries go to the
read replica, and write `INSERT/UPDATE` queries go to the main write-enabled database.
This way, we can keep most read latency to a minimum.

In any case, it is worth considering developing such a mode in the application if you want to scale to more than one
region.
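In a Rails app, such a read/write split can be sketched with the built-in multiple-database support (a configuration sketch under stated assumptions: the `primary` and `primary_replica` names must match entries in your `database.yml`):

```ruby
# app/models/application_record.rb
# Assumes database.yml defines `primary` and `primary_replica` configurations.
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  connects_to database: { writing: :primary, reading: :primary_replica }
end

# Somewhere in application code: route reads explicitly to the replica.
ActiveRecord::Base.connected_to(role: :reading) do
  # SELECT queries here hit the replica; write attempts would raise.
  User.count
end
```

Rails can also switch roles automatically per request via its automatic role-switching middleware, but the explicit block above is the simplest starting point.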
### Create a new EC2 instance to use for database replication

- it is better if it is in the same AWS location where the RDS database will be (most probably `us-east-2`)
- choose Ubuntu as the OS
- use a bigger instance, e.g., `m6i.4xlarge` - the price doesn't matter much, as the instance will not run for long
- if you will be copying a backup via this instance, choose sufficient space for both the OS and the backup, plus some
  free space
- create a security group `public-access` with all inbound and outbound traffic allowed; this will be handy for the
  database setup as well. If you need tighter access controls, that's up to you
- generate a new certificate and save it locally (e.g., `bucardo.pem`); it will be used for the SSH connection. Do not
  forget to set the correct permissions, e.g., `chmod 400 ~/Downloads/bucardo.pem`

After the instance is running on AWS, you can connect to it via SSH as follows:
```sh
ssh ubuntu@1.2.3.4 -i ~/Downloads/bucardo.pem
```

### Creating the RDS instance

- check the `public` box
- pick the `public-access` security group (or whatever you need)
- if you will be restoring from a backup, you can choose a temporarily bigger instance, e.g., `db.r6i.4xlarge`, which
  can be downgraded later
- if you will be using Bucardo onetimecopy, any instance you may need is fine, as Bucardo does the copying in a single
  thread
- it is fairly easy to switch the database instance type afterwards, and it requires only minimal downtime
- storage space needs a careful pick, as it is (a) not possible to shrink, and (b) the auto-expansion that AWS offers
  (and enables by default) can block database modifications for quite long periods (days)

### Running commands in detached mode on the EC2 instance

Some commands that run on EC2 may take a long time, and we may want to disconnect from the SSH session while the
command continues running, then reconnect to the session and see the progress - preferably without installing special
tools. This can be accomplished with the `screen` command, e.g.:

```sh
# this will start a background process and return to the terminal (which can be closed)
screen -dmL ...your command...

# check if screen is still running in the background
ps aux | grep -i screen

# see the output log
cat screenlog.0
```

### Installing Postgres and Bucardo on EC2

Now that RDS and EC2 are both running, we can install local Postgres and Bucardo itself. Let's install
Postgres 13 first. It may be possible to install the latest Postgres, but 13 seems the best choice at the moment.

```sh
# update all your packages
sudo apt update
sudo apt upgrade -y

# add the postgres repository key
sudo sh -c 'echo "deb [arch=$(dpkg --print-architecture)] http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

# update again
sudo apt-get update

# install packages
sudo apt-get -y install make postgresql-13 postgresql-client-13 postgresql-plperl-13 libdbd-pg-perl libdbix-safe-perl
```

The Postgres Perl language (`plperl`) as well as the DBD and DBIx packages are needed for Bucardo.

Now that all dependencies are installed, we can install Bucardo from the latest tarball.

```sh
# install Bucardo itself
wget https://bucardo.org/downloads/Bucardo-5.6.0.tar.gz
tar xzf Bucardo-5.6.0.tar.gz
cd Bucardo-5.6.0
perl Makefile.PL
sudo make install

# create dirs and fix permissions
sudo mkdir /var/run/bucardo
sudo mkdir /var/log/bucardo
sudo chown ubuntu /var/run/bucardo/
sudo chown ubuntu /var/log/bucardo
```

After that, Bucardo is installed as a package and runnable, but we still need to configure everything.
Let's start with Postgres. As this is a temporary installation (only for the period of replication),
it is rather safe to set `trust` for localhost connections (or set things up another way if you prefer).

For this, we need to edit `pg_hba.conf` as follows:
```sh
# edit pg config to make postgres trusted
sudo nano /etc/postgresql/13/main/pg_hba.conf
```

In that file, change the following lines to `trust`:
```sh
# in pg_hba.conf
local all postgres trust
local all all trust
```

Then restart Postgres to pick up the changes:
```sh
# restart postgres
sudo systemctl restart postgresql
```

And finally, we can install the Bucardo service database on local Postgres and see if everything runs.

```sh
# this will create the local bucardo "service" database
bucardo install

# for option 3, pick `postgres` as the user
# for option 4, pick `postgres` as the database from which the initial connection should be attempted
```

:tada: :tada: :tada: now we have local Postgres and Bucardo running, and can continue with the external services
configuration.
### Configuring external (Heroku, RDS) database connections

For this, we will use `pg_service.conf`. It will not work in all places, and sometimes we will need to provide
connection properties manually, but for many commands, it is very useful.

```sh
# create and edit .pg_service.conf
touch ~/.pg_service.conf
nano ~/.pg_service.conf
```

```ini
# ~/.pg_service.conf

[heroku]
host=ec2-xxx.compute-1.amazonaws.com
port=5432
dbname=xxx
user=xxx
password=xxx

[rds]
host=xxx.us-east-2.rds.amazonaws.com
port=5432
dbname=xxx
user=postgres
password=xxx
```

Test connectivity to the databases with:
```sh
psql service=heroku -c '\l+'
psql service=rds -c '\l+'
```

You will see all databases set up on each server (and can see their size):
```console
       Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |   Size    | Tablespace |                Description
------------------+----------+----------+-------------+-------------+-----------------------+-----------+------------+--------------------------------------------
 my-production-db | xxx      | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 821 GB    | pg_default |
 postgres         | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 8205 kB   | pg_default | default administrative connection database
 rdsadmin         | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin+| No Access | pg_default |
                  |          |          |             |             | rdstopmgr=Tc/rdsadmin |           |            |
 template0        | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin          +| 8033 kB   | pg_default | unmodifiable empty database
                  |          |          |             |             | rdsadmin=CTc/rdsadmin |           |            |
 template1        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 8205 kB   | pg_default | default template for new databases
                  |          |          |             |             | postgres=CTc/postgres |           |            |
(5 rows)
```

After both databases are connectable, we can proceed with the replication itself.
|
273
|
+
|
274
|
+
## Performing replication
|
275
|
+
|
276
|
+
:fire: :fire: :fire: **IMPORTANT: from this step, DDL changes are not allowed** :fire: :fire: :fire:
|
277
|
+
|
278
|
+
### Application changes
|
279
|
+
|
280
|
+
- temporary freeze all DDL changes till replication will finish
|
281
|
+
- temporary disable all background jobs or services that are possible to stop. This will help to lessen database
|
282
|
+
load and especially, where possible, to decrease database write operations that will decrease replication pipe as well.
|
283
|
+
|
284
|
+
### Dump and restore the initial schema
|
285
|
+
|
286
|
+
This step doesn't take much time (as it is only a database schema without data),
|
287
|
+
but it is definitely handy to save all output, and closely check for any errors.
|
288
|
+
|
289
|
+
```sh
|
290
|
+
# save heroku schema to `schema.sql`
|
291
|
+
pg_dump service=heroku --schema-only --no-acl --no-owner -v > schema.sql
|
292
|
+
|
293
|
+
# restore `schema.sql` on RDS
|
294
|
+
psql service=rds < schema.sql
|
295
|
+
```
|
296
|
+
|
297
|
+
### Configure Bucardo replication
|
298
|
+
|
299
|
+
After we have all databases connectable and all same schema, we can tell Bucardo what it needs to replicate:
|
300
|
+
```sh
|
301
|
+
# add databases
|
302
|
+
bucardo add db from_db dbhost=xxx dbport=5432 dbuser=xxx dbpass=xxx dbname=xxx
|
303
|
+
bucardo add db to_db dbhost=xxx dbport=5432 dbuser=postgres dbpass=xxx dbname=xxx
|
304
|
+
|
305
|
+
# mark all tables and sequences for replication
|
306
|
+
bucardo add all tables
|
307
|
+
bucardo add all sequences
|
308
|
+
```
|
309
|
+
Here, Bucardo will connect to both databases and collect object metadata for replication.
After that, we can add the sync itself:

```sh
# add sync itself
bucardo add sync mysync tables=all dbs=from_db,to_db onetimecopy=1
```

The most important option here is `onetimecopy=1`, which tells Bucardo to perform the initial data copy
when the sync starts. This copy is done *in a single thread* by creating a pipe (via Bucardo) as follows:
```SQL
-- on heroku
COPY xxx TO STDOUT
-- on rds
COPY xxx FROM STDIN
```

### Run sync

Now that everything is ready, we can push the button and go for a long :coffee:, or maybe even a weekend.

```sh
# starts the Bucardo sync daemon
bucardo start
```

It is also fine to disconnect from SSH, as the Bucardo daemon will keep working in the background.

### Monitor status

To check the progress of the sync (from Bucardo's perspective):
```sh
# overall progress of all syncs
bucardo status

# single sync progress
bucardo status mysync
```

To check what's going on in the databases directly:
```sh
# Bucardo adds a comment to its queries, so it is fairly easy to grep for them
psql service=heroku -c 'select * from pg_stat_activity' | grep -i bucardo
psql service=rds -c 'select * from pg_stat_activity' | grep -i bucardo
```

### After replication catches up, but before the database switch

1. Do a sanity check of the data in the tables. E.g. check:

   - table `COUNT`
   - min/max of PK ids where applicable
   - min/max of `created_at`/`updated_at` where applicable

2. For p1 it is possible to use our checker script that will do this automatically (TODO)

3. Refresh materialized views manually (they are not synced by Bucardo):
   go to `psql` and run `REFRESH MATERIALIZED VIEW ...`

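The checker script mentioned above could be sketched roughly as follows (a minimal sketch: the table names, the aggregates, and the way the stats are collected are assumptions; in practice each dict would be filled by running `SELECT count(*), min(id), max(id)` against both databases):

```python
# Hypothetical sanity checker: compares per-table aggregates collected from
# the source (Heroku) and target (RDS) databases. The stats tuples stand in
# for the results of `SELECT count(*), min(id), max(id)` on each table.
def mismatched_tables(source_stats, target_stats):
    """Return the sorted list of tables whose aggregates differ."""
    tables = set(source_stats) | set(target_stats)
    return sorted(t for t in tables if source_stats.get(t) != target_stats.get(t))

heroku = {"users": (1000, 1, 1000), "posts": (500, 1, 500)}
rds = {"users": (1000, 1, 1000), "posts": (498, 1, 498)}  # posts still catching up

print(mismatched_tables(heroku, rds))  # ['posts']
```

Any table it reports simply needs more time for the sync to catch up before the switch.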
## Switch databases

:fire: :fire: :fire: **This is the final, non-reversible step** :fire: :fire: :fire:

Before this point, all changes can easily be removed or reversed and the database can stay on Heroku as it was before;
after this switch, that is no longer possible (at least not easily).

So... once the sync has caught up, the steps are:

1. start maintenance mode on Heroku with `heroku maintenance:on`
2. scale down and stop all the dynos
3. wait a bit for all queries to finish and for replication to catch up on the latest changes
4. detach the Heroku Postgres add-on from `DATABASE_URL`
5. set `DATABASE_URL` to the RDS url (plaintext now)
6. start the dynos
7. wait for their readiness with `heroku ps:wait`
8. stop maintenance with `heroku maintenance:off`
9. :fire: **Now we are fully on RDS, so DDL changes are allowed** :fire:
10. gradually re-enable all background jobs and services that were temporarily stopped

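The sequence above can be sketched as a small runbook driver (a sketch only: `example-app`, the dyno formation, the `DATABASE` attachment name, and `<rds-url>` are placeholders; in a real migration you would run and verify these heroku commands by hand):

```python
import subprocess

# Hypothetical runbook mirroring the numbered steps above. All app names,
# formation sizes, and the DATABASE_URL value are placeholders.
SWITCH_STEPS = [
    "heroku maintenance:on -a example-app",
    "heroku ps:scale web=0 worker=0 -a example-app",   # stop dynos
    # (pause here for queries to finish and replication to catch up)
    "heroku addons:detach DATABASE -a example-app",    # detach heroku postgres
    "heroku config:set DATABASE_URL=<rds-url> -a example-app",
    "heroku ps:scale web=2 worker=1 -a example-app",   # start dynos
    "heroku ps:wait -a example-app",
    "heroku maintenance:off -a example-app",
]

def run_steps(steps, runner=subprocess.run):
    """Run each command in order; stop at the first failure."""
    completed = []
    for cmd in steps:
        if runner(cmd.split(), check=False).returncode != 0:
            break
        completed.append(cmd)
    return completed
```

Stopping at the first failing command matters here: if, say, the detach fails, you do not want the script to blindly continue and take the app out of maintenance on the wrong database.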
## After switch

Now that we are running on RDS, there is only a single task left on Heroku - make a final backup of the database and save it.

```sh
# capture a backup (will take a lot of time; it is fine to disconnect)
heroku pg:backups:capture -a example-app

# get the url of the backup
heroku pg:backups:url bXXXX -a example-app
```

Now you can download it locally or copy it to S3 via EC2, as it will take quite some time and traffic.

```sh
# download the dump to EC2
screen -dmL time curl 'your-url' -o latest.dump

# install the aws cli (in the way recommended by Amazon)
# ...TODO...

# configure aws credentials
aws configure

# check S3 access
aws s3 ls

# upload to S3
screen -dmL time aws s3 cp latest.dump s3://my-dumps-bucket/ --region us-east-1
```

# Refs

- https://bucardo.org
- https://stackoverflow.com/questions/22264753/linux-how-to-install-dbdpg-module
- https://gist.github.com/luizomf/1a7994cf4263e10dce416a75b9180f01
- https://www.waytoeasylearn.com/learn/bucardo-installation/
- https://gist.github.com/knasyrov/97301801733a31c60521
- https://www.cloudnation.nl/inspiratie/blogs/migrating-heroku-postgresql-to-aurora-rds-with-almost-minimal-downtime
- https://blog.porter.run/migrating-postgres-from-heroku-to-rds/
- https://www.endpointdev.com/blog/2017/06/amazon-aws-upgrades-to-postgres-with/
- https://aws.amazon.com/blogs/database/migrating-legacy-postgresql-databases-to-amazon-rds-or-aurora-postgresql-using-bucardo/