cpflow 3.0.0

Files changed (100)
  1. checksums.yaml +7 -0
  2. data/.github/workflows/check_cpln_links.yml +19 -0
  3. data/.github/workflows/command_docs.yml +24 -0
  4. data/.github/workflows/rspec-shared.yml +56 -0
  5. data/.github/workflows/rspec.yml +28 -0
  6. data/.github/workflows/rubocop.yml +24 -0
  7. data/.gitignore +18 -0
  8. data/.overcommit.yml +16 -0
  9. data/.rubocop.yml +22 -0
  10. data/.simplecov_spawn.rb +10 -0
  11. data/CHANGELOG.md +259 -0
  12. data/CONTRIBUTING.md +73 -0
  13. data/Gemfile +7 -0
  14. data/Gemfile.lock +126 -0
  15. data/LICENSE +21 -0
  16. data/README.md +546 -0
  17. data/Rakefile +21 -0
  18. data/bin/cpflow +6 -0
  19. data/cpflow +6 -0
  20. data/cpflow.gemspec +41 -0
  21. data/docs/assets/grafana-alert.png +0 -0
  22. data/docs/assets/memcached.png +0 -0
  23. data/docs/assets/sidekiq-pre-stop-hook.png +0 -0
  24. data/docs/commands.md +454 -0
  25. data/docs/dns.md +15 -0
  26. data/docs/migrating.md +262 -0
  27. data/docs/postgres.md +436 -0
  28. data/docs/redis.md +128 -0
  29. data/docs/secrets-and-env-values.md +42 -0
  30. data/docs/tips.md +150 -0
  31. data/docs/troubleshooting.md +6 -0
  32. data/examples/circleci.yml +104 -0
  33. data/examples/controlplane.yml +159 -0
  34. data/lib/command/apply_template.rb +209 -0
  35. data/lib/command/base.rb +540 -0
  36. data/lib/command/build_image.rb +49 -0
  37. data/lib/command/cleanup_images.rb +136 -0
  38. data/lib/command/cleanup_stale_apps.rb +79 -0
  39. data/lib/command/config.rb +48 -0
  40. data/lib/command/copy_image_from_upstream.rb +108 -0
  41. data/lib/command/delete.rb +149 -0
  42. data/lib/command/deploy_image.rb +56 -0
  43. data/lib/command/doctor.rb +47 -0
  44. data/lib/command/env.rb +22 -0
  45. data/lib/command/exists.rb +23 -0
  46. data/lib/command/generate.rb +45 -0
  47. data/lib/command/info.rb +222 -0
  48. data/lib/command/latest_image.rb +19 -0
  49. data/lib/command/logs.rb +49 -0
  50. data/lib/command/maintenance.rb +42 -0
  51. data/lib/command/maintenance_off.rb +62 -0
  52. data/lib/command/maintenance_on.rb +62 -0
  53. data/lib/command/maintenance_set_page.rb +34 -0
  54. data/lib/command/no_command.rb +23 -0
  55. data/lib/command/open.rb +33 -0
  56. data/lib/command/open_console.rb +26 -0
  57. data/lib/command/promote_app_from_upstream.rb +38 -0
  58. data/lib/command/ps.rb +41 -0
  59. data/lib/command/ps_restart.rb +37 -0
  60. data/lib/command/ps_start.rb +51 -0
  61. data/lib/command/ps_stop.rb +82 -0
  62. data/lib/command/ps_wait.rb +40 -0
  63. data/lib/command/run.rb +573 -0
  64. data/lib/command/setup_app.rb +113 -0
  65. data/lib/command/test.rb +23 -0
  66. data/lib/command/version.rb +18 -0
  67. data/lib/constants/exit_code.rb +7 -0
  68. data/lib/core/config.rb +316 -0
  69. data/lib/core/controlplane.rb +552 -0
  70. data/lib/core/controlplane_api.rb +170 -0
  71. data/lib/core/controlplane_api_direct.rb +112 -0
  72. data/lib/core/doctor_service.rb +104 -0
  73. data/lib/core/helpers.rb +26 -0
  74. data/lib/core/shell.rb +100 -0
  75. data/lib/core/template_parser.rb +76 -0
  76. data/lib/cpflow/version.rb +6 -0
  77. data/lib/cpflow.rb +288 -0
  78. data/lib/deprecated_commands.json +9 -0
  79. data/lib/generator_templates/Dockerfile +27 -0
  80. data/lib/generator_templates/controlplane.yml +62 -0
  81. data/lib/generator_templates/entrypoint.sh +8 -0
  82. data/lib/generator_templates/templates/app.yml +21 -0
  83. data/lib/generator_templates/templates/postgres.yml +176 -0
  84. data/lib/generator_templates/templates/rails.yml +36 -0
  85. data/rakelib/create_release.rake +81 -0
  86. data/script/add_command +37 -0
  87. data/script/check_command_docs +3 -0
  88. data/script/check_cpln_links +45 -0
  89. data/script/rename_command +43 -0
  90. data/script/update_command_docs +62 -0
  91. data/templates/app.yml +13 -0
  92. data/templates/daily-task.yml +32 -0
  93. data/templates/maintenance.yml +25 -0
  94. data/templates/memcached.yml +24 -0
  95. data/templates/postgres.yml +32 -0
  96. data/templates/rails.yml +27 -0
  97. data/templates/redis.yml +21 -0
  98. data/templates/redis2.yml +37 -0
  99. data/templates/sidekiq.yml +38 -0
  100. metadata +341 -0
data/docs/migrating.md ADDED

# Steps to Migrate from Heroku to Control Plane

We recommend following along with
[this example project](https://github.com/shakacode/react-webpack-rails-tutorial).

1. [Clone the Staging Environment](#clone-the-staging-environment)
   - [Review Special Gems](#review-special-gems)
   - [Create a Minimum Bootable Config](#create-a-minimum-bootable-config)
2. [Create the Review App Process](#create-the-review-app-process)
   - [Database for Review Apps](#database-for-review-apps)
   - [Redis and Memcached for Review Apps](#redis-and-memcached-for-review-apps)
3. [Deploy to Production](#deploy-to-production)

## Clone the Staging Environment

By cloning the staging environment on Heroku, you can speed up the initial provisioning of the app on Control Plane
without compromising your current environment.

Consider migrating just the web dyno first, and get other types of dynos working afterward. You can also move the
add-ons to Control Plane later, once the app works as expected.

First, create a new Heroku app with all the add-ons, copying the data from the current staging app.

Then, copy the project-specific configs to a `.controlplane/` directory at the top of your project. `cpflow` picks them
up based on the project folder tree it runs in, so you can work on several projects with different configs without
explicitly switching between them.

Edit the `.controlplane/controlplane.yml` file as needed. Note that the `my-app-staging` name used in the examples below
is defined in this file. See
[this example](https://github.com/shakacode/react-webpack-rails-tutorial/blob/master/.controlplane/controlplane.yml).

Before the initial setup, add the templates for the app to the `.controlplane/controlplane.yml` file, using the
`setup_app_templates` key, e.g.:

```yaml
my-app-staging:
  <<: *common
  setup_app_templates:
    - app
    - redis
    - memcached
    - rails
    - sidekiq
```

Note how the templates correspond to files in the `.controlplane/templates/` directory. These files will be used by the
`cpflow setup-app` and `cpflow apply-template` commands.
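
For instance, individual templates can also be applied on demand (a sketch - template names are assumed here to be
positional arguments; verify with `cpflow apply-template --help`):

```sh
# apply a single template to an existing app, e.g., just the redis workload
cpflow apply-template redis -a my-app-staging
```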

Ensure that the env vars point to the Heroku add-ons in the template for the app (`.controlplane/templates/app.yml`). See
[this example](https://github.com/shakacode/react-webpack-rails-tutorial/blob/master/.controlplane/templates/gvc.yml).

After that, create a Dockerfile in `.controlplane/Dockerfile` for your deployment. See
[this example](https://github.com/shakacode/react-webpack-rails-tutorial/blob/master/.controlplane/Dockerfile).

You should have a folder structure similar to the following:

```sh
app_main_folder/
  .controlplane/
    Dockerfile          # Your app's Dockerfile, with some Control Plane changes.
    controlplane.yml
    entrypoint.sh       # App-specific - edit as needed.
    templates/
      app.yml
      memcached.yml
      rails.yml
      redis.yml
      sidekiq.yml
```

The example
[`.controlplane/` directory](https://github.com/shakacode/react-webpack-rails-tutorial/tree/master/.controlplane)
already contains these files.

Finally, check the app for any Heroku-specific code and update it, such as the `HEROKU_SLUG_COMMIT` env var and other
env vars beginning with `HEROKU_`. You should add some logic to check for the Control Plane equivalents - it might be
worth adding a `CONTROLPLANE` env var to act as a feature flag and help run different code for Heroku and Control Plane
until the migration is complete.

You might want to [review special gems](#review-special-gems) and
[create a minimum bootable config](#create-a-minimum-bootable-config).

At first, do the deployments from the command line. Then set up CI scripts to trigger the deployment upon merges to
master/main.

Use these commands for the initial setup and deployment:

```sh
# Provision infrastructure (one-time-only for new apps) using templates.
cpflow setup-app -a my-app-staging

# Build and push image with auto-tagging, e.g., "my-app-staging:1_456".
cpflow build-image -a my-app-staging --commit 456

# Prepare database.
cpflow run -a my-app-staging --image latest -- rails db:prepare

# Deploy latest image.
cpflow deploy-image -a my-app-staging

# Open app in browser.
cpflow open -a my-app-staging
```

Then, for promoting code upgrades:

```sh
# Build and push a new image with sequential tagging, e.g., "my-app-staging:2".
cpflow build-image -a my-app-staging

# Or build and push a new image with sequential tagging and commit SHA, e.g., "my-app-staging:2_ABC".
cpflow build-image -a my-app-staging --commit ABC

# Run database migrations (or other release tasks) with the latest image, while the app is still running on the
# previous image. This is analogous to the release phase.
cpflow run -a my-app-staging --image latest -- rails db:migrate

# Deploy latest image.
cpflow deploy-image -a my-app-staging
```

### Review Special Gems

Make sure to review "special" gems that might be related to Heroku, e.g.:

- `rails_autoscale_agent`. It's specific to Heroku, so it must be removed.
- `puma_worker_killer`. In general, it's unnecessary on Control Plane, as Kubernetes containers restart based on their
  own logic and may not restart at all if everything is OK.
- `rack-timeout`. It could possibly be replaced with Control Plane's `timeout` option.

You can use the `CONTROLPLANE` env var to separate the gems, e.g.:

```ruby
# Gemfile
group :staging, :production do
  gem "rack-timeout"

  unless ENV.key?("CONTROLPLANE")
    gem "rails_autoscale_agent"
    gem "puma_worker_killer"
  end
end
```

### Create a Minimum Bootable Config

You can try to create a minimum bootable config to migrate parts of your app gradually. To do that, follow these steps
(the commands from steps 3, 6, and 8 are collected in a single block after this list):

1. Rename the existing `application.yml` file to some other name (e.g., `application.old.yml`)
2. Create a new **minimal** `application.yml` file, e.g.:

   ```yaml
   SECRET_KEY_BASE: "123"
   # These should be enabled for `rails s`, not `rails assets:precompile`.
   # DATABASE_URL: postgres://localhost:5432/dbname
   # RAILS_SERVE_STATIC_FILES: "true"

   # You will add whatever env vars are required here later.
   ```

3. Try running `RAILS_ENV=production CONTROLPLANE=true rails assets:precompile`
   (theoretically, this should work without any additional env vars)
4. Fix whatever code needs to be fixed and add missing env vars
   (the fewer env vars are needed, the cleaner the `Dockerfile` will be)
5. Enable the `DATABASE_URL` and `RAILS_SERVE_STATIC_FILES` env vars
6. Try running `RAILS_ENV=production CONTROLPLANE=true rails s`
7. Fix whatever code needs to be fixed and add the required env vars to `application.yml`
8. Try running your **production** entrypoint command, e.g.,
   `RAILS_ENV=production RACK_ENV=production CONTROLPLANE=true puma -C config/puma.rb`
9. Fix whatever code needs to be fixed and add the required env vars to `application.yml`
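
The commands from the steps above, collected for convenience:

```sh
# iterate on these until each one runs cleanly with the minimal application.yml
RAILS_ENV=production CONTROLPLANE=true rails assets:precompile
RAILS_ENV=production CONTROLPLANE=true rails s
RAILS_ENV=production RACK_ENV=production CONTROLPLANE=true puma -C config/puma.rb
```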

Now you should have a minimal bootable config.

Then you can temporarily set the `LOG_LEVEL=debug` env var and disable unnecessary services to help with the process,
e.g.:

```yaml
DISABLE_SPRING: "true"
SCOUT_MONITOR: "false"
RACK_TIMEOUT_SERVICE_TIMEOUT: "0"
```

## Create the Review App Process

Add an entry for review apps to the `.controlplane/controlplane.yml` file. By adding a `match_if_app_name_starts_with`
key with the value `true`, any app whose name starts with the entry's name will use this config. Doing this allows you
to configure an entry for, e.g., `my-app-review`, and then create review apps starting with that name (e.g.,
`my-app-review-1234`, `my-app-review-5678`, etc.). Here's an example:

```yaml
my-app-review:
  <<: *common
  match_if_app_name_starts_with: true
  setup_app_templates:
    - app
    - redis
    - memcached
    - rails
    - sidekiq
```

In your CI scripts, you can create a review app using some identifier (e.g., the number of the PR on GitHub):

```sh
# On CircleCI, you can use `echo $CIRCLE_PULL_REQUEST | grep -Eo '[0-9]+$'` to extract the number of the PR.
PR_NUM=$(... extract the number of the PR here ...)
echo "export APP_NAME=my-app-review-$PR_NUM" >> $BASH_ENV

# Only create the app if it doesn't exist yet, as we may have multiple triggers for the review app
# (such as when a PR gets updated).
if ! cpflow exists -a ${APP_NAME}; then
  cpflow setup-app -a ${APP_NAME}
  echo "export NEW_APP=true" >> $BASH_ENV
fi

# The `NEW_APP` env var that we exported above can be used to either reset or migrate the database before deploying.
if [ -n "${NEW_APP}" ]; then
  cpflow run -a ${APP_NAME} --image latest -- rails db:reset
else
  cpflow run -a ${APP_NAME} --image latest -- rails db:migrate
fi
```

Then follow the same steps as for staging for the initial deployment or code upgrades.
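
For instance, with the `APP_NAME` exported above (a sketch using the same commands shown earlier for staging):

```sh
# build and push an image for the review app
cpflow build-image -a ${APP_NAME}

# deploy the latest image (migrations were already handled above via NEW_APP)
cpflow deploy-image -a ${APP_NAME}
```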

### Database for Review Apps

The review app resources should be handled as env vars in the template for the app
(`.controlplane/templates/app.yml`), e.g.:

```yaml
- name: DATABASE_URL
  value: postgres://postgres:XXXXXXXX@cpln-XXXX-staging.XXXXXX.us-east-1.rds.amazonaws.com:5432/APP_GVC
```

Notice that `APP_GVC` is replaced with the app name (e.g., `my-app-review-1234`), which is used as the database name on
RDS, so each review app gets its own database on the single RDS instance shared by all review apps.

### Redis and Memcached for Review Apps

As long as no persistence is needed for Redis and Memcached, the workload templates in the `templates/` directory of
this repository should be sufficient for review apps. Using these templates results in considerable cost savings
compared to paying for the resources on Heroku.

```yaml
- name: MEMCACHE_SERVERS
  value: memcached.APP_GVC.cpln.local
- name: REDIS_URL
  value: redis://redis.APP_GVC.cpln.local:6379
```

## Deploy to Production

Only try deploying to production once staging and review apps are working well.

For simplicity, keep the add-ons running on Heroku initially. You could move the database to RDS first; however, it's a
bit simpler to isolate any differences in cost and performance by first moving your compute to Control Plane.

Ensure that your Control Plane compute is in the AWS region `us-east-1`; otherwise, you'll have noticeable extra latency
on calls to your resources. You might also have egress charges from Control Plane.

Use the `cpflow promote-app-from-upstream` command to promote the staging app to production.
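
A sketch (assuming production is configured with the staging app as its upstream in `controlplane.yml`, and that `-t`
supplies a token for the upstream org - check `cpflow promote-app-from-upstream --help` for the exact flags):

```sh
# copy the latest image from the staging app and deploy it to production
cpflow promote-app-from-upstream -a my-app-production -t $CPLN_TOKEN_STAGING
```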
data/docs/postgres.md ADDED

# Migrating a Postgres database from Heroku infrastructure

One of the biggest problems when moving off Heroku infrastructure is migrating the database. Although migration is
rather easy between Heroku-hosted databases or between non-Heroku-hosted databases (Postgres has native tools for
this), it is not easily possible between Heroku and anything outside Heroku,
**as Heroku doesn't allow setting up WAL replication for Postgres**. Period. No replication of any kind outside of
Heroku infrastructure for Postgres.

Previously, it was reportedly possible to ask Heroku support to manually set up WAL log shipping, but they
no longer do that. This leaves only two options:
11
+
12
+ ### Option A: dump and restore way
13
+
14
+ Nothing problematic here in general **if you can withstand long application maintenance time**.
15
+ You basically need to:
16
+
17
+ 1. enable maintenance
18
+ 2. stop the application completely and wait for all the database writes to finish
19
+ 3. dump database on Heroku
20
+ 4. restore the database on RDS
21
+ 5. start the application
22
+ 6. disable maintenance
23
+
24
+ And if the database is small or it is a hobby app, this should not be looked any further.
25
+ However, this is not acceptable for 99% of production apps as their databases are huge and maintenance time
26
+ should be as small as possible.
27
+
28
+ Rough timing for a 1Gb database can be (but your mileage may vary):
29
+
30
+ - 2.5h creating Heroku backup
31
+ - 0.5h downloading backup to EC2
32
+ - 13h restoring a backup on RDS (in 4 threads)
33
+
34
+ **~16h total time, equals maintenance downtime**
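
For reference, a minimal sketch of the dump-and-restore path (`example-app`, the RDS host, and the `mydb` database name
are placeholders; `pg:backups:capture` and `pg:backups:url` also appear in the "After the switch" section below):

```sh
# with maintenance on and dynos stopped, capture and download a backup
heroku pg:backups:capture -a example-app
curl -o latest.dump "$(heroku pg:backups:url -a example-app)"

# restore into RDS in 4 parallel jobs, dropping Heroku-specific ownership/ACLs
pg_restore --verbose --clean --no-acl --no-owner -j 4 \
  -h xxx.us-east-2.rds.amazonaws.com -U postgres -d mydb latest.dump
```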

### Option B: the logical replication way

Several logical replication solutions exist for Postgres - Slony, Bucardo, Londiste, Mimeo - but when you dig deeper,
the only viable and up-to-date solution for migrating from Heroku to RDS is Bucardo.

The migration process with Bucardo looks as follows:

1. set up Bucardo on a dedicated EC2 instance
2. dump the Heroku database schema and restore it on RDS - rather fast, as there is no data
3. start Bucardo replication - this will install triggers and a special schema in your database
4. wait for replication to catch up - this may take a long time, but the application can continue working as usual
5. enable maintenance
6. stop the application completely and wait for replication to finish
7. switch the database connection strings
8. start the application
9. disable maintenance

Maintenance downtime here can be minutes, not hours or days as in Option A - but there are no free lunches: the process
is more complex.

Rough timing for a 1GB database can be (but your mileage may vary):

- whatever setup time, no hurry
- 1.5 days for the onetimecopy (in 1 thread) - DDL changes not allowed, but no downtime
- 1-2 min for the database switch (maintenance downtime)

**~2 days total time, ~1-2 min maintenance downtime**

### Some considerations

- DDL changes should be "frozen and postponed" while Bucardo replication is running. There is also a way to stop
  replication, update the DDL in both databases, and restart replication; however, a no-DDL window of a day or two
  seems a reasonable restriction for production databases versus the potential for errors.

- there is a "speed up" option of restoring a dump (with threads) and then running Bucardo to catch up only on the
  deltas, but it looks unnecessary, as the speed gain is minimal versus the potential for errors. It would not speed
  things up dramatically - it would only save a couple of hours of non-maintenance time (which would most probably be
  spent at the command line anyway) - so it's not worth doing.

## Before replication

### Application code changes

Before everything else, we need to recheck the database schema and ensure **that every table has a primary key (PK)
in place**, as Bucardo uses PKs for replication.

> NOTE: Theoretically, Bucardo can work with unique indexes as well, but having a PK on each table is easy and avoids
> unnecessary complications.

So, please stop here and do whatever is needed for your application.
84
+
85
+ ### Choosing database location and EC2 location
86
+
87
+ All Heroku Postgres databases for location US are running in AWS on `us-east-1`. Control Plane, on the other side,
88
+ recommends `us-east-2` as a default. So we need to choose:
89
+
90
+ - either simple setup - main database in `us-east-2`
91
+ - or a bit more complex - main database in `us-east-2`, replica in `us-east-1` (which can be removed later)
92
+
93
+ This makes sense if your application supports working with replicas. Then read-only `SELECT` queries will go to the
94
+ read replica and write `INSERT/UPDATE` queries will go to the main write-enabled database.
95
+ This way we can keep most reading latency to the minimum.
96
+
97
+ Anyway, it is worth to consider developing such a mode in the application if you want to scale in more than 1 region.
98
+
99
+ ### Create new EC2 instance which we will use for database replication
100
+
101
+ - better if it will be in the same AWS location where RDS database will be (most probably `us-east-2`)
102
+ - choose Ubuntu as OS
103
+ - use some bigger instance, e.g. `m6i.4xlarge` - price doesn't matter much, as such instance will not run long time
104
+ - if you will be copying backup via this instance, choose sufficient space for both OS and backup and some free space
105
+ - create security group `public-access` with all inbound and outbound traffic allowed. this will be handy as well for
106
+ database setup. if you need tighter access controls, up to you
107
+ - generate a new certificate and save it locally (e.g. `bucardo.pem`), will be used for SSH connection. Do not forget to
108
+ update correct permissions e.g. `chown TODO ~/Dowloads/bucardo.pem`
109
+
110
+ After the instance will be running on AWS, you can connect to it via SSH as follows:
111
+ ```sh
112
+ ssh ubuntu@1.2.3.4 -i ~/Downloads/bucardo.pem
113
+ ```

### Creating the RDS instance

- check the `public` box
- pick the `public-access` security group (or whatever you need)
- if you will be restoring from a backup, you can choose a temporarily bigger instance, e.g., `db.r6i.4xlarge`, which
  can be downgraded later
- if you will be using Bucardo onetimecopy, any instance size you need is fine, as Bucardo does the copying in a
  single thread
- it is fairly easy to switch the database instance type afterwards, and it requires only minimal downtime
- storage space needs a careful pick, as a) it is not possible to shrink it and b) the auto-expansion that AWS offers
  (and which should be enabled by default) can block database modifications for quite long periods (days)

### Running commands in detached mode on the EC2 instance

Some commands that run on EC2 may take a long time, and we may want to disconnect from the SSH session while the
command continues running, then reconnect later and see the progress - ideally without installing special tools.
This can be accomplished with the `screen` command, e.g.:

```sh
# this will start a background process and return to the terminal (which can be closed)
screen -dmL ...your command...

# check if screen is still running in the background
ps aux | grep -i screen

# see the output log
cat screenlog.0
```
143
+
144
+ ### Installing Postgres and Bucardo on EC2
145
+
146
+ Now, when RDS is running and EC2 is running we can start installing local Postgres and Bucardo itself. Let's install
147
+ Postgres 13 first. It may be possible to install the latest Postgres, but 13 seems the best choice atm.
148
+
149
+ ```sh
150
+ # update all your packages
151
+ sudo apt update
152
+ sudo apt upgrade -y
153
+
154
+ # add postgres repository key
155
+ sudo sh -c 'echo "deb [arch=$(dpkg --print-architecture)] http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
156
+
157
+ wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
158
+
159
+ # update again
160
+ sudo apt-get update
161
+
162
+ # install packages
163
+ sudo apt-get -y install make postgresql-13 postgresql-client-13 postgresql-plperl-13 libdbd-pg-perl libdbix-safe-perl
164
+ ```

The Postgres Perl language (`plperl`) as well as the DBD and DBIx packages are needed for Bucardo.

Now, with all dependencies installed, we can install Bucardo from the latest tarball.

```sh
# install Bucardo itself
wget https://bucardo.org/downloads/Bucardo-5.6.0.tar.gz
tar xzf Bucardo-5.6.0.tar.gz
cd Bucardo-5.6.0
perl Makefile.PL
sudo make install

# create dirs and fix permissions
sudo mkdir /var/run/bucardo
sudo mkdir /var/log/bucardo
sudo chown ubuntu /var/run/bucardo/
sudo chown ubuntu /var/log/bucardo
```

After that, Bucardo is physically installed and runnable, but we still need to configure everything.
Let's start with Postgres. As this is a temporary installation (only for the period of replication),
it is rather safe to `trust` localhost connections (or set up authentication another way if you prefer).

For this, we need to edit `pg_hba.conf` as follows:

```sh
# edit the pg config to make postgres trusted
sudo nano /etc/postgresql/13/main/pg_hba.conf
```

In that file, change the following lines to `trust`:

```sh
# in pg_hba.conf
local all postgres trust
local all all trust
```

Then restart Postgres to pick up the changes:

```sh
# restart postgres
sudo systemctl restart postgresql
```

And finally, we can install the Bucardo service database on local Postgres and see if everything runs.

```sh
# this will create the local bucardo "service" database
bucardo install

# for option 3, pick `postgres` as the user
# for option 4, pick `postgres` as the database from which the initial connection should be attempted
```

:tada: :tada: :tada: Now we have local Postgres and Bucardo running, and we can continue with configuring the external
services.

### Configuring external (Heroku, RDS) database connections

For this, we will use `pg_service.conf`. It does not work in all places, and sometimes we will need to provide
connection properties manually, but for many commands it is very useful.

```sh
# create and edit .pg_service.conf
touch ~/.pg_service.conf
nano ~/.pg_service.conf
```

```ini
# ~/.pg_service.conf

[heroku]
host=ec2-xxx.compute-1.amazonaws.com
port=5432
dbname=xxx
user=xxx
password=xxx

[rds]
host=xxx.us-east-2.rds.amazonaws.com
port=5432
dbname=xxx
user=postgres
password=xxx
```

Test connectivity to the databases with:

```sh
psql service=heroku -c '\l+'
psql service=rds -c '\l+'
```

You will see all the databases set up on each server (and their sizes):

```console
       Name       |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |   Size    | Tablespace |                Description
------------------+----------+----------+-------------+-------------+-----------------------+-----------+------------+--------------------------------------------
 my-production-db | xxx      | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 821 GB    | pg_default |
 postgres         | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 8205 kB   | pg_default | default administrative connection database
 rdsadmin         | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin+| No Access | pg_default |
                  |          |          |             |             | rdstopmgr=Tc/rdsadmin |           |            |
 template0        | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin          +| 8033 kB   | pg_default | unmodifiable empty database
                  |          |          |             |             | rdsadmin=CTc/rdsadmin |           |            |
 template1        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +| 8205 kB   | pg_default | default template for new databases
                  |          |          |             |             | postgres=CTc/postgres |           |            |
(5 rows)
```

After both databases are connectable, we can proceed with the replication itself.

## Performing replication

:fire: :fire: :fire: **IMPORTANT: from this step on, DDL changes are not allowed** :fire: :fire: :fire:

### Application changes

- temporarily freeze all DDL changes until replication finishes
- temporarily disable all background jobs and services that can be stopped. This will help lessen the database load
  and, especially, where possible, decrease database write operations, which shrinks the replication pipe as well.

### Dump and restore the initial schema

This step doesn't take much time (as it is only the database schema, without data),
but it is definitely handy to save all the output and closely check it for any errors.

```sh
# save the heroku schema to `schema.sql`
pg_dump service=heroku --schema-only --no-acl --no-owner -v > schema.sql

# restore `schema.sql` on RDS
psql service=rds < schema.sql
```

### Configure Bucardo replication

Once both databases are connectable and have the same schema, we can tell Bucardo what it needs to replicate:

```sh
# add databases
bucardo add db from_db dbhost=xxx dbport=5432 dbuser=xxx dbpass=xxx dbname=xxx
bucardo add db to_db dbhost=xxx dbport=5432 dbuser=postgres dbpass=xxx dbname=xxx

# mark all tables and sequences for replication
bucardo add all tables
bucardo add all sequences
```

Here, Bucardo will connect to the databases and collect object metadata for replication.
After that, we can add the sync itself:

```sh
# add the sync itself
bucardo add sync mysync tables=all dbs=from_db,to_db onetimecopy=1
```

The most important option here is `onetimecopy=1`, which tells Bucardo to perform the initial data copy when the sync
starts. The copy is done *in a single thread* by creating a pipe (via Bucardo) as follows:

```sql
-- on heroku
COPY xxx TO STDOUT
-- on rds
COPY xxx FROM STDIN
```
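
Before starting the sync, it's worth double-checking what Bucardo has registered (a quick sanity check using Bucardo's
`list` subcommands):

```sh
# verify the databases, tables, and sync that were just added
bucardo list dbs
bucardo list tables
bucardo list syncs
```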

### Run sync

And now, when everything is ready, we can push the button and go for a long :coffee: or maybe even a weekend.

```sh
# starts the Bucardo sync daemon
bucardo start
```

It is also OK to disconnect from SSH, as the Bucardo daemon will keep working in the background.

### Monitor status

To check the progress of the sync (from Bucardo's perspective):

```sh
# overall progress of all syncs
bucardo status

# single sync progress
bucardo status mysync
```

To check what's going on in the databases directly:

```sh
# Bucardo adds a comment to its queries, so it is fairly easy to grep for those
psql service=heroku -c 'select * from pg_stat_activity' | grep -i bucardo
psql service=rds -c 'select * from pg_stat_activity' | grep -i bucardo
```
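
To keep an eye on progress without retyping, the standard `watch` utility works fine, e.g.:

```sh
# refresh the sync status every 60 seconds
watch -n 60 bucardo status mysync
```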

### After replication catches up, but before the database switch

1. Do a sanity check of the data in the tables, e.g., check (see the sketch after this list):

   - table `COUNT`
   - min/max of PK ids where applicable
   - min/max of `created_at`/`updated_at` where applicable

2. For step 1, it is possible to use our checker script, which will do this automatically (TODO)

3. Refresh materialized views manually (as they are not synced by Bucardo).
   Just go to `psql` and `REFRESH MATERIALIZED VIEW ...`
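
A sketch of such checks with `psql` against both `pg_service.conf` entries (`users` is a hypothetical table name -
repeat for each important table):

```sh
# row counts should match
psql service=heroku -tAc 'SELECT COUNT(*) FROM users;'
psql service=rds    -tAc 'SELECT COUNT(*) FROM users;'

# min/max of the PK and the latest timestamps where applicable
psql service=heroku -tAc 'SELECT MIN(id), MAX(id), MAX(updated_at) FROM users;'
psql service=rds    -tAc 'SELECT MIN(id), MAX(id), MAX(updated_at) FROM users;'
```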

## Switch databases

:fire: :fire: :fire: **This is the final, non-reversible step** :fire: :fire: :fire:
Before this point, all changes can easily be removed or reversed, and the database can stay on Heroku as it was before;
after this switch, that is no longer possible (at least not easily).

So... after the sync catches up, you basically need to (see the sketch after this list):

1. start maintenance mode on Heroku: `heroku maintenance:on`
2. scale down and stop all the dynos
3. wait a bit for all queries to finish and for replication to catch up on the latest changes
4. detach Heroku Postgres from `DATABASE_URL`
5. set `DATABASE_URL` to the RDS URL (now a plain config var)
6. start the dynos
7. wait for their readiness with `heroku ps:wait`
8. stop maintenance with `heroku maintenance:off`
9. :fire: **Now we are fully on RDS, so DDL changes are allowed** :fire:
10. you can gradually re-enable all background jobs and services that were temporarily stopped
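
In Heroku CLI terms, the switch looks roughly like this (a sketch for an app named `example-app` with `web` and
`worker` dynos; the attachment name in the detach step may differ - check `heroku addons -a example-app`):

```sh
heroku maintenance:on -a example-app           # 1. maintenance on
heroku ps:scale web=0 worker=0 -a example-app  # 2. stop all dynos
# 3. wait for replication to catch up (bucardo status mysync)
heroku addons:detach DATABASE -a example-app   # 4. detach Heroku Postgres from DATABASE_URL
heroku config:set DATABASE_URL="postgres://postgres:xxx@xxx.us-east-2.rds.amazonaws.com:5432/xxx" -a example-app  # 5
heroku ps:scale web=1 worker=1 -a example-app  # 6. start dynos
heroku ps:wait -a example-app                  # 7. wait for readiness
heroku maintenance:off -a example-app          # 8. maintenance off
```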

## After the switch

Now that we are running on RDS, there is only a single task left on Heroku - make a final backup of the database and
save it.

```sh
# capture the backup (will take lots of time); you can disconnect while it runs
heroku pg:backups:capture -a example-app

# get the URL of the backup
heroku pg:backups:url bXXXX -a example-app
```

Now you can download it locally, or copy it to S3 via EC2, as it will take quite some time and traffic.

```sh
# download the dump to EC2
screen -dmL time curl 'your-url' -o latest.dump

# install the aws cli (in the way recommended by Amazon)
# ...TODO...

# configure aws credentials
aws configure

# check S3 access
aws s3 ls

# upload to S3
screen -dmL time aws s3 cp latest.dump s3://my-dumps-bucket/ --region us-east-1
```

## Refs

- https://bucardo.org
- https://stackoverflow.com/questions/22264753/linux-how-to-install-dbdpg-module
- https://gist.github.com/luizomf/1a7994cf4263e10dce416a75b9180f01
- https://www.waytoeasylearn.com/learn/bucardo-installation/
- https://gist.github.com/knasyrov/97301801733a31c60521
- https://www.cloudnation.nl/inspiratie/blogs/migrating-heroku-postgresql-to-aurora-rds-with-almost-minimal-downtime
- https://blog.porter.run/migrating-postgres-from-heroku-to-rds/
- https://www.endpointdev.com/blog/2017/06/amazon-aws-upgrades-to-postgres-with/
- https://aws.amazon.com/blogs/database/migrating-legacy-postgresql-databases-to-amazon-rds-or-aurora-postgresql-using-bucardo/