pg_easy_replicate 0.2.1 → 0.2.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 2a39a74d7e35cec6dce951d23eff51972e22811779ab8637edfa989567e62620
- data.tar.gz: 6a81b795c7968b5d5c218f35084cdf40bf558271fc8e290ef75fe28a570ce9b6
+ metadata.gz: 86261e2b60e55c5770bbd1537d5a3a501a514d1eaf8e741c5c81890d864b0a0f
+ data.tar.gz: bf6a710930b4f9533d01c3f66e3b91c02ad5a422df9d17be2bbaf72cdb979a2e
  SHA512:
- metadata.gz: f9af65ec14a25975af9fab3efa884d91aa90a43aae74b355370608b3c5e61453f6fd71dd333bd8917193f8b90ec86965464893f9aae2f1a965f2b40eaeebc882
- data.tar.gz: 9643dc3de2102b82ab29d085e90b3dfbbfa923d0c3324d91e1bff8fa7082a8136263e59566559cabbf230a3094648a2a52152912b24dd760ffc3c727b5a11f39
+ metadata.gz: 5d7aab6e89948a45248fcb19fb6d3e9a6982adbc23cba094fe413b2a376cbffb1bbfdb011833f16c4c3aeb3c4d3e1192792699b4c360a20555fc05972e2ae111
+ data.tar.gz: a4c72947b466664dfc716a2ed21cab159d0f6768b3ae74c9c95db105bafb22da57244236a1f9677e1936af8140989dae9a0c5b1729c2f4bc62dfcbbad0d6a007
data/CHANGELOG.md CHANGED
@@ -1,4 +1,8 @@
- ## [0.2.1] - 2023-12-29
+ ## [0.2.1] - 2024-01-21
+
+ - Extend config check to assert for REPLICA IDENTITY on tables and drop index bug - #88
+
+ ## [0.2.1] - 2024-01-20
 
  - Don't attempt to drop and recreate unique indices - #88
  - Dependency updates
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
  remote: .
  specs:
- pg_easy_replicate (0.2.1)
+ pg_easy_replicate (0.2.3)
  ougai (~> 2.0.0)
  pg (~> 1.5.3)
  sequel (>= 5.69, < 5.77)
data/README.md CHANGED
@@ -4,7 +4,7 @@
  [![Smoke spec](https://github.com/shayonj/pg_easy_replicate/actions/workflows/smoke.yaml/badge.svg?branch=main)](https://github.com/shayonj/pg_easy_replicate/actions/workflows/ci.yaml)
  [![Gem Version](https://badge.fury.io/rb/pg_easy_replicate.svg?2)](https://badge.fury.io/rb/pg_easy_replicate)
 
- `pg_easy_replicate` is a CLI orchestrator tool that simplifies the process of setting up [logical replication](https://www.postgresql.org/docs/current/logical-replication.html) between two PostgreSQL databases. `pg_easy_replicate` also supports switchover. After the source (primary database) is fully replicated, `pg_easy_replicate` puts it into read-only mode and via logical replication flushes all data to the new target database. This ensures zero data loss and minimal downtime for the application. This method can be useful for performing minimal downtime (up to <1min, depending) major version upgrades between two PostgreSQL databases, load testing with blue/green database setup and other similar use cases.
+ `pg_easy_replicate` is a CLI orchestrator tool that simplifies the process of setting up [logical replication](https://www.postgresql.org/docs/current/logical-replication.html) between two PostgreSQL databases. `pg_easy_replicate` also supports switchover. After the source (primary database) is fully replicated, `pg_easy_replicate` puts it into read-only mode and via logical replication flushes all data to the new target database. This ensures zero data loss and minimal downtime for the application. This method can be useful for performing minimal downtime (up to <1min, depending) major version upgrades between a Blue/Green PostgreSQL database setup, load testing and other similar use cases.
 
  Battle tested in production at [Tines](https://www.tines.com/) 🚀
 
@@ -19,7 +19,7 @@ Battle tested in production at [Tines](https://www.tines.com/) 🚀
  - [Config check](#config-check)
  - [Bootstrap](#bootstrap)
  - [Bootstrap and Config Check with special user role in AWS or GCP](#bootstrap-and-config-check-with-special-user-role-in-aws-or-gcp)
- - [Config Check](#config-check)
+ - [Config Check](#config-check-1)
  - [Bootstrap](#bootstrap-1)
  - [Start sync](#start-sync)
  - [Stats](#stats)
@@ -29,7 +29,7 @@ Battle tested in production at [Tines](https://www.tines.com/) 🚀
  - [Rolling restart strategy](#rolling-restart-strategy)
  - [DNS Failover strategy](#dns-failover-strategy)
  - [FAQ](#faq)
- - [Adding internal user to pgBouncer `userlist`](#adding-internal-user-to-pgbouncer-userlist)
+ - [Adding internal user to `pg_hba` or pgBouncer `userlist`](#adding-internal-user-to-pg_hba-or-pgbouncer-userlist)
  - [Contributing](#contributing)
 
  ## Installation
@@ -61,6 +61,7 @@ https://hub.docker.com/r/shayonj/pg_easy_replicate
  - PostgreSQL 10 and later
  - Ruby 3.0 and later
  - Database users should have `SUPERUSER` permissions, or pass in a special user with privileges to create the needed role, schema, publication and subscription on both databases. More on `--special-user-role` section below.
+ - See more on [FAQ](#faq) below
 
  ## Limits
 
@@ -256,9 +257,9 @@ Next, you can set up a program that watches the `stats` and waits until `switcho
 
  ## FAQ
 
- ### Adding internal user to pgBouncer `userlist`
+ ### Adding internal user to `pg_hba` or pgBouncer `userlist`
 
- `pg_easy_replicate` creates a special user to orchestrate the replication. If you use pgBouncer, you may need to allow `pger_su_h1a4fb` as a user that can perform login by adding it to the `userlist`.
+ `pg_easy_replicate` sets up a designated user for managing the replication process. In case you handle user permissions through `pg_hba`, it's necessary to modify this list to permit sessions from `pger_su_h1a4fb`. Similarly, with pgBouncer, you'll need to authorize `pger_su_h1a4fb` for login access by including it in the `userlist`.
 
  ## Contributing
 
@@ -16,10 +16,21 @@ module PgEasyReplicate
  aliases: "-c",
  boolean: true,
  desc: "Copy schema to the new database"
+ method_option :tables,
+ aliases: "-t",
+ default: "",
+ desc:
+ "Comma separated list of table names. Default: All tables"
+ method_option :schema_name,
+ aliases: "-s",
+ desc:
+ "Name of the schema tables are in, only required if passing list of tables"
  def config_check
  PgEasyReplicate.assert_config(
  special_user_role: options[:special_user_role],
  copy_schema: options[:copy_schema],
+ tables: options[:tables],
+ schema_name: options[:schema_name],
  )
 
  puts "✅ Config is looking good."
@@ -77,6 +88,7 @@ module PgEasyReplicate
  "Name of the schema tables are in, only required if passing list of tables"
  method_option :tables,
  aliases: "-t",
+ default: "",
  desc:
  "Comma separated list of table names. Default: All tables"
  method_option :recreate_indices_post_copy,
@@ -107,8 +119,12 @@ module PgEasyReplicate
  desc: "Name of the group previously provisioned"
  method_option :lag_delta_size,
  aliases: "-l",
- desc:
- "The size of the lag to watch for before switchover. Default 200KB."
+ desc: "The size of the lag to watch for before switchover. Default 200KB."
+ method_option :skip_vacuum_analyze,
+ type: :boolean,
+ default: false,
+ aliases: "-s",
+ desc: "Skip vacuum analyzing tables before switchover."
  # method_option :bi_directional,
  # aliases: "-b",
  # desc:
@@ -117,6 +133,7 @@ module PgEasyReplicate
  PgEasyReplicate::Orchestrate.switchover(
  group_name: options[:group_name],
  lag_delta_size: options[:lag_delta_size],
+ skip_vacuum_analyze: options[:skip_vacuum_analyze]
  )
  end
 
@@ -72,5 +72,32 @@ module PgEasyReplicate
  raise(msg) if test_env?
  abort(msg)
  end
+
+ def determine_tables(conn_string:, list: "", schema: nil)
+ schema ||= "public"
+
+ tables = list&.split(",") || []
+ if tables.size > 0
+ tables
+ else
+ list_all_tables(schema: schema, conn_string: conn_string)
+ end
+ end
+
+ def list_all_tables(schema:, conn_string:)
+ Query
+ .run(
+ query:
+ "SELECT table_name
+ FROM information_schema.tables
+ WHERE table_schema = '#{schema}' AND
+ table_type = 'BASE TABLE'
+ ORDER BY table_name",
+ connection_url: conn_string,
+ user: db_user(conn_string),
+ )
+ .map(&:values)
+ .flatten
+ end
  end
  end
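The table-resolution rule added in this hunk can be illustrated in isolation. The following is a minimal sketch, not the gem's code: `determine_tables_sketch`, `fetch_all`, and `fake_lookup` are hypothetical names, and the lambda stands in for the `list_all_tables` database query so the example runs without PostgreSQL.

```ruby
# Sketch of the rule: an explicit comma separated list wins; otherwise
# fall back to listing every table in the schema (default "public").
# `fetch_all` is a hypothetical stand-in for the database lookup.
def determine_tables_sketch(fetch_all:, list: "", schema: nil)
  schema ||= "public" # same default as in the hunk above

  tables = list&.split(",") || [] # a nil list is treated like an empty one
  return tables unless tables.empty?

  fetch_all.call(schema)
end

# Hypothetical lookup returning fixed names instead of querying a database.
fake_lookup = ->(schema) { ["#{schema}_items", "#{schema}_sellers"] }

explicit = determine_tables_sketch(list: "items,sellers", fetch_all: fake_lookup)
fallback = determine_tables_sketch(list: "", fetch_all: fake_lookup)
nil_list = determine_tables_sketch(list: nil, schema: "audit", fetch_all: fake_lookup)
```

Returning an array in both branches is what lets the later hunks drop their repeated `split(",")` calls and join the names only when storing them on the group.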
@@ -17,7 +17,7 @@ module PgEasyReplicate
  tables: tables,
  schema: schema,
  ).each do |index|
- drop_sql = "DROP INDEX CONCURRENTLY #{index[:index_name]};"
+ drop_sql = "DROP INDEX CONCURRENTLY #{schema}.#{index[:index_name]};"
 
  Query.run(
  query: drop_sql,
@@ -56,8 +56,8 @@ module PgEasyReplicate
  end
 
  def self.fetch_indices(conn_string:, tables:, schema:)
- return [] if tables.split(",").empty?
- table_list = tables.split(",").map { |table| "'#{table}'" }.join(",")
+ return [] if tables.empty?
+ table_list = tables.map { |table| "'#{table}'" }.join(",")
 
  sql = <<-SQL
  SELECT
@@ -46,7 +46,7 @@ module PgEasyReplicate
 
  Group.create(
  name: options[:group_name],
- table_names: tables,
+ table_names: tables.join(","),
  schema_name: schema_name,
  started_at: Time.now.utc,
  recreate_indices_post_copy: options[:recreate_indices_post_copy],
@@ -63,7 +63,7 @@ module PgEasyReplicate
  else
  Group.create(
  name: options[:group_name],
- table_names: tables,
+ table_names: tables.join(","),
  schema_name: schema_name,
  started_at: Time.now.utc,
  failed_at: Time.now.utc,
@@ -92,42 +92,24 @@ module PgEasyReplicate
  schema:,
  group_name:,
  conn_string:,
- tables: ""
+ tables: []
  )
  logger.info(
  "Adding tables up publication",
  { publication_name: publication_name(group_name) },
  )
 
- tables
- .split(",")
- .map do |table_name|
- Query.run(
- query:
- "ALTER PUBLICATION #{quote_ident(publication_name(group_name))}
- ADD TABLE #{quote_ident(table_name)}",
- connection_url: conn_string,
- schema: schema,
- )
- end
- rescue => e
- raise "Unable to add tables to publication: #{e.message}"
- end
-
- def list_all_tables(schema:, conn_string:)
- Query
- .run(
+ tables.map do |table_name|
+ Query.run(
  query:
- "SELECT table_name
- FROM information_schema.tables
- WHERE table_schema = '#{schema}' AND
- table_type = 'BASE TABLE'
- ORDER BY table_name",
+ "ALTER PUBLICATION #{quote_ident(publication_name(group_name))}
+ ADD TABLE #{quote_ident(table_name)}",
  connection_url: conn_string,
+ schema: schema,
  )
- .map(&:values)
- .flatten
- .join(",")
+ end
+ rescue => e
+ raise "Unable to add tables to publication: #{e.message}"
  end
 
  def drop_publication(group_name:, conn_string:)
@@ -222,15 +204,19 @@ module PgEasyReplicate
  group_name:,
  source_conn_string: source_db_url,
  target_conn_string: target_db_url,
- lag_delta_size: nil
+ lag_delta_size: nil,
+ skip_vacuum_analyze: false
  )
  group = Group.find(group_name)
+ tables_list = group[:table_names].split(",")
 
- run_vacuum_analyze(
- conn_string: target_conn_string,
- tables: group[:table_names],
- schema: group[:schema_name],
- )
+ unless skip_vacuum_analyze
+ run_vacuum_analyze(
+ conn_string: target_conn_string,
+ tables: tables_list,
+ schema: group[:schema_name],
+ )
+ end
 
  watch_lag(group_name: group_name, lag: lag_delta_size || DEFAULT_LAG)
 
@@ -239,7 +225,7 @@ module PgEasyReplicate
  IndexManager.recreate_indices(
  source_conn_string: source_db_url,
  target_conn_string: target_db_url,
- tables: group[:table_names],
+ tables: tables_list,
  schema: group[:schema_name],
  )
  end
@@ -255,11 +241,13 @@ module PgEasyReplicate
  )
  mark_switchover_complete(group_name)
  # Run vacuum analyze to refresh the planner post switchover
- run_vacuum_analyze(
- conn_string: target_conn_string,
- tables: group[:table_names],
- schema: group[:schema_name],
- )
+ unless skip_vacuum_analyze
+ run_vacuum_analyze(
+ conn_string: target_conn_string,
+ tables: tables_list,
+ schema: group[:schema_name],
+ )
+ end
  drop_subscription(
  group_name: group_name,
  target_conn_string: target_conn_string,
@@ -369,21 +357,21 @@ module PgEasyReplicate
  end
 
  def run_vacuum_analyze(conn_string:, tables:, schema:)
- tables
- .split(",")
- .each do |t|
- logger.info(
- "Running vacuum analyze on #{t}",
- schema: schema,
- table: t,
- )
- Query.run(
- query: "VACUUM VERBOSE ANALYZE #{t}",
- connection_url: conn_string,
- schema: schema,
- transaction: false,
- )
- end
+ tables.each do |t|
+ logger.info(
+ "Running vacuum analyze on #{t}",
+ schema: schema,
+ table: t,
+ )
+
+ Query.run(
+ query: "VACUUM VERBOSE ANALYZE #{t};",
+ connection_url: conn_string,
+ schema: schema,
+ transaction: false,
+ using_vacuum_analyze: true,
+ )
+ end
  rescue => e
  raise "Unable to run vacuum and analyze: #{e.message}"
  end
@@ -391,16 +379,6 @@ module PgEasyReplicate
  def mark_switchover_complete(group_name)
  Group.update(group_name: group_name, switchover_completed_at: Time.now)
  end
-
- private
-
- def determine_tables(schema:, conn_string:, list: "")
- tables = list&.split(",") || []
- unless tables.size > 0
- return list_all_tables(schema: schema, conn_string: conn_string)
- end
- ""
- end
  end
  end
  end
@@ -10,20 +10,26 @@ module PgEasyReplicate
  connection_url:,
  user: internal_user_name,
  schema: nil,
- transaction: true
+ transaction: true,
+ using_vacuum_analyze: false
  )
  conn =
  connect(connection_url: connection_url, schema: schema, user: user)
+ timeout ||= "5s"
  if transaction
  r =
  conn.transaction do
  conn.run("SET search_path to #{quote_ident(schema)}") if schema
- conn.run("SET statement_timeout to '5s'")
+ conn.run("SET statement_timeout to '#{timeout}'")
  conn.fetch(query).to_a
  end
  else
  conn.run("SET search_path to #{quote_ident(schema)}") if schema
- conn.run("SET statement_timeout to '5s'")
+ if using_vacuum_analyze
+ conn.run("SET statement_timeout=0")
+ else
+ conn.run("SET statement_timeout to '5s'")
+ end
  r = conn.fetch(query).to_a
  end
  conn.disconnect
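The timeout selection introduced in this hunk can be read as a small decision table. The sketch below restates it as a pure function so it can be checked without a database connection; `timeout_statement` is a hypothetical helper, not part of the gem, and the `"5s"` default mirrors the value shown in the diff. In PostgreSQL, a `statement_timeout` of `0` disables the timeout, which is presumably why it is used for long-running `VACUUM ANALYZE` calls.

```ruby
# Hypothetical helper: returns the SET statement the hunk above would issue.
# Only the non-transactional path honours `using_vacuum_analyze`; inside a
# transaction the regular timeout is always applied.
def timeout_statement(transaction:, using_vacuum_analyze:, timeout: "5s")
  if !transaction && using_vacuum_analyze
    "SET statement_timeout=0" # 0 turns the statement timeout off
  else
    "SET statement_timeout to '#{timeout}'"
  end
end

vacuum_stmt = timeout_statement(transaction: false, using_vacuum_analyze: true)
plain_stmt = timeout_statement(transaction: false, using_vacuum_analyze: false)
txn_stmt = timeout_statement(transaction: true, using_vacuum_analyze: true)
```

VACUUM cannot run inside a transaction block, which fits with `run_vacuum_analyze` passing `transaction: false` together with `using_vacuum_analyze: true`.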
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module PgEasyReplicate
- VERSION = "0.2.1"
+ VERSION = "0.2.3"
  end
@@ -26,7 +26,12 @@ module PgEasyReplicate
  extend Helper
 
  class << self
- def config(special_user_role: nil, copy_schema: false)
+ def config(
+ special_user_role: nil,
+ copy_schema: false,
+ tables: "",
+ schema_name: nil
+ )
  abort_with("SOURCE_DB_URL is missing") if source_db_url.nil?
  abort_with("TARGET_DB_URL is missing") if target_db_url.nil?
 
@@ -56,15 +61,31 @@ module PgEasyReplicate
  user: db_user(target_db_url),
  ),
  pg_dump_exists: pg_dump_exists,
+ tables_have_replica_identity:
+ tables_have_replica_identity?(
+ conn_string: source_db_url,
+ tables: tables,
+ schema_name: schema_name,
+ ),
  }
  rescue => e
  abort_with("Unable to check config: #{e.message}")
  end
  end
 
- def assert_config(special_user_role: nil, copy_schema: false)
+ def assert_config(
+ special_user_role: nil,
+ copy_schema: false,
+ tables: "",
+ schema_name: nil
+ )
  config_hash =
- config(special_user_role: special_user_role, copy_schema: copy_schema)
+ config(
+ special_user_role: special_user_role,
+ copy_schema: copy_schema,
+ tables: tables,
+ schema_name: schema_name,
+ )
 
  if copy_schema && !config_hash.dig(:pg_dump_exists)
  abort_with("pg_dump must exist if copy_schema (-c) is passed")
@@ -82,6 +103,19 @@ module PgEasyReplicate
  abort_with("User on source database does not have super user privilege")
  end
 
+ if tables.split(",").size > 0 && (schema_name.nil? || schema_name == "")
+ abort_with("Schema name is required if tables are passed")
+ end
+
+ unless config_hash.dig(:tables_have_replica_identity)
+ abort_with(
+ "Ensure all tables involved in logical replication have an appropriate replica identity set. This can be done using:
+ 1. Default (Primary Key): `ALTER TABLE table_name REPLICA IDENTITY DEFAULT;`
+ 2. Unique Index: `ALTER TABLE table_name REPLICA IDENTITY USING INDEX index_name;`
+ 3. Full (All Columns): `ALTER TABLE table_name REPLICA IDENTITY FULL;`",
+ )
+ end
+
  return if config_hash.dig(:target_db_is_super_user)
  abort_with("User on target database does not have super user privilege")
  end
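The new "schema name is required" guard in this hunk is a plain string check on the CLI options. As a rough illustration, assuming a hypothetical predicate name `schema_name_missing?` that does not exist in the gem, the condition can be exercised on its own:

```ruby
# Hypothetical predicate mirroring the guard above: true when a table list
# was passed but no usable schema name accompanies it.
def schema_name_missing?(tables:, schema_name:)
  !tables.split(",").empty? && (schema_name.nil? || schema_name == "")
end

no_tables = schema_name_missing?(tables: "", schema_name: nil)
tables_without_schema = schema_name_missing?(tables: "items,sellers", schema_name: "")
tables_with_schema = schema_name_missing?(tables: "items,sellers", schema_name: "public")
```

Because `tables` defaults to `""` in the CLI options, omitting `--tables` never triggers the guard.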
@@ -352,5 +386,47 @@ module PgEasyReplicate
  )
  .any? { |q| q[:username] == user }
  end
+
+ def tables_have_replica_identity?(
+ conn_string:,
+ tables: "",
+ schema_name: nil
+ )
+ schema_name ||= "public"
+
+ table_list =
+ determine_tables(
+ schema: schema_name,
+ conn_string: source_db_url,
+ list: tables,
+ )
+ return false if table_list.empty?
+
+ formatted_table_list = table_list.map { |table| "'#{table}'" }.join(", ")
+
+ sql = <<~SQL
+ SELECT t.relname AS table_name,
+ CASE
+ WHEN t.relreplident = 'd' THEN 'default'
+ WHEN t.relreplident = 'n' THEN 'nothing'
+ WHEN t.relreplident = 'i' THEN 'index'
+ WHEN t.relreplident = 'f' THEN 'full'
+ END AS replica_identity
+ FROM pg_class t
+ JOIN pg_namespace ns ON t.relnamespace = ns.oid
+ WHERE ns.nspname = '#{schema_name}'
+ AND t.relkind = 'r'
+ AND t.relname IN (#{formatted_table_list})
+ SQL
+
+ results =
+ Query.run(
+ query: sql,
+ connection_url: conn_string,
+ user: db_user(conn_string),
+ )
+
+ results.all? { |r| r[:replica_identity] != "nothing" }
+ end
  end
  end
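To make the replica identity check above easier to follow, here is a minimal sketch of its final step applied to in-memory rows. The label mapping follows the `CASE` expression in the hunk; `REPLICA_IDENTITY_LABELS`, `all_have_replica_identity?`, and the sample rows are hypothetical and only stand in for what the query against `pg_class` might return.

```ruby
# Mapping of pg_class.relreplident codes to labels, as in the CASE above.
REPLICA_IDENTITY_LABELS = {
  "d" => "default",
  "n" => "nothing",
  "i" => "index",
  "f" => "full",
}.freeze

# Hypothetical helper: passes only if no table is set to NOTHING.
def all_have_replica_identity?(rows)
  rows.all? do |row|
    REPLICA_IDENTITY_LABELS.fetch(row[:relreplident]) != "nothing"
  end
end

# Made-up sample rows; a real run would read these from the source database.
healthy = [
  { table_name: "items", relreplident: "d" },
  { table_name: "sellers", relreplident: "f" },
]
broken = healthy + [{ table_name: "audit_log", relreplident: "n" }]

healthy_ok = all_have_replica_identity?(healthy)
broken_ok = all_have_replica_identity?(broken)
```

Note that `all?` returns `true` for an empty array, so a check like this says nothing useful when no rows come back.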
data/scripts/e2e-start.sh CHANGED
@@ -10,7 +10,7 @@ export SOURCE_DB_URL="postgres://james-bond:james-bond123%407%21%273aaR@localhos
  export TARGET_DB_URL="postgres://james-bond:james-bond123%407%21%273aaR@localhost:5433/postgres-db"
  export PGPASSWORD='james-bond123@7!'"'"''"'"'3aaR'
 
- # Bootstrap and cleanup
+ # Config check, Bootstrap and cleanup
  echo "===== Performing Bootstrap and cleanup"
  bundle exec bin/pg_easy_replicate bootstrap -g cluster-1 --copy-schema
  bundle exec bin/pg_easy_replicate start_sync -g cluster-1 -s public --recreate-indices-post-copy
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: pg_easy_replicate
  version: !ruby/object:Gem::Version
- version: 0.2.1
+ version: 0.2.3
  platform: ruby
  authors:
  - Shayon Mukherjee
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2024-01-20 00:00:00.000000000 Z
+ date: 2024-01-28 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: ougai