pgsync 0.6.1 → 0.6.6

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 6c4a582f7fd5b997ba91247f2b57ad68aae7b5fad282c32dca873d75667922af
- data.tar.gz: a7ae6518b84d28c8ad14786793b8d4257d898214b8071ccc6c91e9d4d144e316
+ metadata.gz: 1097ac9939cf312566746d7d3ff06604ee5997c9e2b85ae38f7d46d77dbdefc9
+ data.tar.gz: 06b70ef312aa796c9bf083b89a7cd6ff3c8a99920d8b1eabdd290cd67d956559
  SHA512:
- metadata.gz: f008b6d7114e3f11395479de0e85006c5eb48681b13ab613e1468d20a99acf0121566fe5499e236d1e8b07fdd12acc2c3a99399daac3e3e99269efac5e6bd4e5
- data.tar.gz: 536af357b17f35c7afb0e8efba5411a0ecbec67adbfe364d465a8831dd0e56d108055b1ccb55bc0203d92d9a3215a11d3a287f52e89b40b6526534ddfbc8371e
+ metadata.gz: 815aeffa2bd01469b0fbc229a25b8f2ff8b69f28c3bfac21c61f571875bf74914cdaabfe9150d3db271ac2f10dcef4c0c4f542b72f306618b4b2a0985d59d388
+ data.tar.gz: 128cd39679c96354d93213e611d679275d2c3de75a29b847aa8cb911c03d13f3b9231f7318a5ea3a4b8bc98684eb6a3ed07975159a9d89ee4aa2ac53746365f7
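These checksums cover the two members of the gem package, metadata.gz and data.tar.gz. A minimal Ruby sketch (not part of the gem) for recomputing the SHA256 values from a locally fetched package — the file path is an assumption:

```ruby
require "digest"
require "rubygems/package"

# assumes the package was downloaded first, e.g. with: gem fetch pgsync --version 0.6.6
gem_path = "pgsync-0.6.6.gem"

File.open(gem_path, "rb") do |io|
  Gem::Package::TarReader.new(io) do |tar|
    tar.each do |entry|
      # a .gem file is a tar archive; hash only the members listed in checksums.yaml
      next unless ["metadata.gz", "data.tar.gz"].include?(entry.full_name)
      puts "#{entry.full_name}: #{Digest::SHA256.hexdigest(entry.read)}"
    end
  end
end
```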
@@ -1,3 +1,26 @@
+ ## 0.6.6 (2020-10-29)
+
+ - Added support for tables with generated columns
+
+ ## 0.6.5 (2020-07-10)
+
+ - Improved help
+
+ ## 0.6.4 (2020-06-10)
+
+ - Log SQL with `--debug` option
+ - Improved sequence queries
+
+ ## 0.6.3 (2020-06-09)
+
+ - Added `--defer-constraints-v2` option
+ - Ensure consistent source snapshot with `--disable-integrity`
+
+ ## 0.6.2 (2020-06-09)
+
+ - Added support for `--disable-integrity` on Amazon RDS
+ - Fixed error when excluded table not found in source
+
  ## 0.6.1 (2020-06-07)

  - Added Django and Laravel integrations
data/README.md CHANGED
@@ -198,20 +198,20 @@ Rules starting with `unique_` require the table to have a single column primary

  Foreign keys can make it difficult to sync data. Three options are:

- 1. Manually specify the order of tables
- 2. Use deferrable constraints
- 3. Disable foreign key triggers, which can silently break referential integrity
+ 1. Defer constraints (recommended)
+ 2. Manually specify the order of tables
+ 3. Disable foreign key triggers, which can silently break referential integrity (not recommended)

- When manually specifying the order, use `--jobs 1` so tables are synced one-at-a-time.
+ To defer constraints, use:

  ```sh
- pgsync table1,table2,table3 --jobs 1
+ pgsync --defer-constraints-v2
  ```

- If your tables have [deferrable constraints](https://begriffs.com/posts/2017-08-27-deferrable-sql-constraints.html), use:
+ To manually specify the order of tables, use `--jobs 1` so tables are synced one-at-a-time.

  ```sh
- pgsync --defer-constraints
+ pgsync table1,table2,table3 --jobs 1
  ```

  To disable foreign key triggers and potentially break referential integrity, use:
@@ -220,6 +220,8 @@ To disable foreign key triggers and potentially break referential integrity, use
  pgsync --disable-integrity
  ```

+ This requires superuser privileges on the `to` database. If syncing to (not from) Amazon RDS, use the `rds_superuser` role. If syncing to (not from) Heroku, there doesn’t appear to be a way to disable integrity.
+
  ## Triggers

  Disable user triggers with:
@@ -305,6 +307,14 @@ exclude:
  - schema_migrations
  ```

+ ## Debugging
+
+ To view the SQL that’s run, use:
+
+ ```sh
+ pgsync --debug
+ ```
+
  ## Other Commands

  Help
@@ -380,6 +390,10 @@ Also check out:

  Inspired by [heroku-pg-transfer](https://github.com/ddollar/heroku-pg-transfer).

+ ## History
+
+ View the [changelog](https://github.com/ankane/pgsync/blob/master/CHANGELOG.md)
+
  ## Contributing

  Everyone is encouraged to help improve this project. Here are a few ways you can help:
@@ -18,6 +18,7 @@ require "pgsync/client"
  require "pgsync/data_source"
  require "pgsync/init"
  require "pgsync/schema_sync"
+ require "pgsync/sequence"
  require "pgsync/sync"
  require "pgsync/table"
  require "pgsync/table_sync"
@@ -39,39 +39,74 @@ module PgSync
  def slop_options
  o = Slop::Options.new
  o.banner = %{Usage:
- pgsync [options]
+ pgsync [tables,groups] [sql] [options]}

- Options:}
- o.string "-d", "--db", "database"
- o.string "-t", "--tables", "tables to sync"
- o.string "-g", "--groups", "groups to sync"
- o.integer "-j", "--jobs", "number of tables to sync at a time"
+ # not shown
+ o.string "-t", "--tables", "tables to sync", help: false
+ o.string "-g", "--groups", "groups to sync", help: false
+
+ o.separator ""
+ o.separator "Table options:"
+ o.string "--exclude", "tables to exclude"
  o.string "--schemas", "schemas to sync"
- o.string "--from", "source"
- o.string "--to", "destination"
- o.string "--exclude", "exclude tables"
- o.string "--config", "config file"
- o.boolean "--to-safe", "accept danger", default: false
- o.boolean "--debug", "debug", default: false
- o.boolean "--list", "list", default: false
- o.boolean "--overwrite", "overwrite existing rows", default: false, help: false
+ o.boolean "--all-schemas", "sync all schemas", default: false
+
+ o.separator ""
+ o.separator "Row options:"
+ o.boolean "--overwrite", "overwrite existing rows", default: false
  o.boolean "--preserve", "preserve existing rows", default: false
  o.boolean "--truncate", "truncate existing rows", default: false
- o.boolean "--schema-first", "schema first", default: false
- o.boolean "--schema-only", "schema only", default: false
- o.boolean "--all-schemas", "all schemas", default: false
- o.boolean "--no-rules", "do not apply data rules", default: false
- o.boolean "--no-sequences", "do not sync sequences", default: false
- o.boolean "--init", "init", default: false
- o.boolean "--in-batches", "in batches", default: false, help: false
- o.integer "--batch-size", "batch size", default: 10000, help: false
- o.float "--sleep", "sleep", default: 0, help: false
- o.boolean "--fail-fast", "stop on the first failed table", default: false
- o.boolean "--defer-constraints", "defer constraints", default: false
- o.boolean "--disable-user-triggers", "disable non-system triggers", default: false
+
+ o.separator ""
+ o.separator "Foreign key options:"
+ o.boolean "--defer-constraints-v2", "defer constraints", default: false
  o.boolean "--disable-integrity", "disable foreign key triggers", default: false
- o.boolean "-v", "--version", "print the version"
- o.boolean "-h", "--help", "prints help"
+ o.integer "-j", "--jobs", "number of tables to sync at a time"
+
+ # replaced by v2
+ o.boolean "--defer-constraints", "defer constraints", default: false, help: false
+ # private, for testing
+ o.boolean "--disable-integrity-v2", "disable foreign key triggers", default: false, help: false
+
+ o.separator ""
+ o.separator "Schema options:"
+ o.boolean "--schema-first", "sync schema first", default: false
+ o.boolean "--schema-only", "sync schema only", default: false
+
+ o.separator ""
+ o.separator "Config options:"
+ # technically, defaults to searching path for .pgsync.yml, but this is simpler
+ o.string "--config", "config file (defaults to .pgsync.yml)"
+ o.string "-d", "--db", "database-specific config file"
+
+ o.separator ""
+ o.separator "Connection options:"
+ o.string "--from", "source database URL"
+ o.string "--to", "destination database URL"
+ o.boolean "--to-safe", "confirms destination is safe (when not localhost)", default: false
+
+ o.separator ""
+ o.separator "Other options:"
+ o.boolean "--debug", "show SQL statements", default: false
+ o.boolean "--disable-user-triggers", "disable non-system triggers", default: false
+ o.boolean "--fail-fast", "stop on the first failed table", default: false
+ o.boolean "--no-rules", "don't apply data rules", default: false
+ o.boolean "--no-sequences", "don't sync sequences", default: false
+
+ # not shown in help
+ # o.separator ""
+ # o.separator "Append-only table options:"
+ o.boolean "--in-batches", "sync in batches", default: false, help: false
+ o.integer "--batch-size", "batch size", default: 10000, help: false
+ o.float "--sleep", "time to sleep between batches", default: 0, help: false
+
+ o.separator ""
+ o.separator "Other commands:"
+ o.boolean "--init", "create config file", default: false
+ o.boolean "--list", "list tables", default: false
+ o.boolean "-h", "--help", "print help"
+ o.boolean "-v", "--version", "print version"
+
  o
  end
  end
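The reorganized options above rely on two Slop features: `o.separator` lines, which print heading text between flags in the help output, and `help: false`, which keeps an option parseable but hidden. A standalone sketch of that pattern (not pgsync code), assuming slop >= 4.8.2 as now required by this release:

```ruby
require "slop"

o = Slop::Options.new
o.banner = "Usage:\n    mytool [options]"

o.separator ""
o.separator "Table options:"
o.string "--exclude", "tables to exclude"

o.separator ""
o.separator "Other options:"
o.boolean "--debug", "show SQL statements", default: false

# still accepted on the command line, but hidden from the help output
o.boolean "--legacy-flag", "replaced option", default: false, help: false

puts o  # renders the banner, separators, and visible options
```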
@@ -4,8 +4,10 @@ module PgSync

  attr_reader :url

- def initialize(url)
+ def initialize(url, name:, debug:)
  @url = url
+ @name = name
+ @debug = debug
  end

  def exists?
@@ -50,10 +52,6 @@ module PgSync
  table_set.include?(table)
  end

- def sequences(table, columns)
- execute("SELECT #{columns.map { |f| "pg_get_serial_sequence(#{escape("#{quote_ident_full(table)}")}, #{escape(f)}) AS #{quote_ident(f)}" }.join(", ")}").first.values.compact
- end
-
  def max_id(table, primary_key, sql_clause = nil)
  execute("SELECT MAX(#{quote_ident(primary_key)}) FROM #{quote_ident_full(table)}#{sql_clause}").first["max"].to_i
  end
@@ -62,39 +60,14 @@ module PgSync
  execute("SELECT MIN(#{quote_ident(primary_key)}) FROM #{quote_ident_full(table)}#{sql_clause}").first["min"].to_i
  end

- # this value comes from pg_get_serial_sequence which is already quoted
  def last_value(seq)
- execute("SELECT last_value FROM #{seq}").first["last_value"]
+ execute("SELECT last_value FROM #{quote_ident_full(seq)}").first["last_value"]
  end

  def truncate(table)
  execute("TRUNCATE #{quote_ident_full(table)} CASCADE")
  end

- # https://stackoverflow.com/a/20537829
- # TODO can simplify with array_position in Postgres 9.5+
- def primary_key(table)
- query = <<~SQL
- SELECT
- pg_attribute.attname,
- format_type(pg_attribute.atttypid, pg_attribute.atttypmod),
- pg_attribute.attnum,
- pg_index.indkey
- FROM
- pg_index, pg_class, pg_attribute, pg_namespace
- WHERE
- nspname = $1 AND
- relname = $2 AND
- indrelid = pg_class.oid AND
- pg_class.relnamespace = pg_namespace.oid AND
- pg_attribute.attrelid = pg_class.oid AND
- pg_attribute.attnum = any(pg_index.indkey) AND
- indisprimary
- SQL
- rows = execute(query, [table.schema, table.name])
- rows.sort_by { |r| r["indkey"].split(" ").index(r["attnum"]) }.map { |r| r["attname"] }
- end
-
  def triggers(table)
  query = <<~SQL
  SELECT
@@ -148,20 +121,34 @@ module PgSync
  end

  def execute(query, params = [])
+ log_sql query, params
  conn.exec_params(query, params).to_a
  end

  def transaction
  if conn.transaction_status == 0
  # not currently in transaction
- conn.transaction do
- yield
- end
+ log_sql "BEGIN"
+ result =
+ conn.transaction do
+ yield
+ end
+ log_sql "COMMIT"
+ result
  else
  yield
  end
  end

+ # TODO log time for each statement
+ def log_sql(query, params = {})
+ if @debug
+ message = "#{colorize("[#{@name}]", :cyan)} #{query.gsub(/\s+/, " ").strip}"
+ message = "#{message} #{params.inspect}" if params.any?
+ log message
+ end
+ end
+
  private

  def concurrent_id
@@ -62,7 +62,7 @@ module PgSync
  end

  def django?
- (File.read("manage.py") =~ /django/i) rescue false
+ file_exists?("manage.py", /django/i)
  end

  def heroku?
@@ -70,11 +70,21 @@ module PgSync
  end

  def laravel?
- File.exist?("artisan")
+ file_exists?("artisan")
  end

  def rails?
- File.exist?("bin/rails")
+ file_exists?("bin/rails")
+ end
+
+ def file_exists?(path, contents = nil)
+ if contents
+ File.read(path).match(contents)
+ else
+ File.exist?(path)
+ end
+ rescue
+ false
  end
  end
  end
@@ -0,0 +1,29 @@
+ # minimal class to keep schema and sequence name separate
+ module PgSync
+ class Sequence
+ attr_reader :schema, :name, :column
+
+ def initialize(schema, name, column:)
+ @schema = schema
+ @name = name
+ @column = column
+ end
+
+ def full_name
+ "#{schema}.#{name}"
+ end
+
+ def eql?(other)
+ other.schema == schema && other.name == name
+ end
+
+ # override hash when overriding eql?
+ def hash
+ [schema, name].hash
+ end
+
+ def to_s
+ full_name
+ end
+ end
+ end
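Overriding `eql?` and `hash` is what lets sequences from the source and destination be compared as values: `Array#&` (used later by `shared_sequences`) and hash-based collections match elements through those two methods. A small illustration, assuming the gem (and therefore the class above) is installed and loadable:

```ruby
require "pgsync"

# the same sequence discovered on both sides, as two distinct objects
from = [PgSync::Sequence.new("public", "users_id_seq", column: "id")]
to   = [PgSync::Sequence.new("public", "users_id_seq", column: "id")]

# the intersection works because eql?/hash compare schema and name
shared = to & from
puts shared.map(&:to_s)  # => public.users_id_seq
```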
@@ -34,13 +34,13 @@ module PgSync
  raise Error, "Danger! Add `to_safe: true` to `.pgsync.yml` if the destination is not localhost or 127.0.0.1"
  end

+ print_description("From", source)
+ print_description("To", destination)
+
  if (opts[:preserve] || opts[:overwrite]) && destination.server_version_num < 90500
  raise Error, "Postgres 9.5+ is required for --preserve and --overwrite"
  end

- print_description("From", source)
- print_description("To", destination)
-
  resolver = TaskResolver.new(args: args, opts: opts, source: source, destination: destination, config: config, first_schema: first_schema)
  tasks =
  resolver.tasks.map do |task|
@@ -126,15 +126,15 @@ module PgSync
  end

  def source
- @source ||= data_source(@options[:from])
+ @source ||= data_source(@options[:from], "from")
  end

  def destination
- @destination ||= data_source(@options[:to])
+ @destination ||= data_source(@options[:to], "to")
  end

- def data_source(url)
- ds = DataSource.new(url)
+ def data_source(url, name)
+ ds = DataSource.new(url, name: name, debug: @options[:debug])
  ObjectSpace.define_finalizer(self, self.class.finalize(ds))
  ds
  end
@@ -17,6 +17,10 @@ module PgSync

  add_columns

+ add_primary_keys
+
+ add_sequences unless opts[:no_sequences]
+
  show_notes

  # don't sync tables with no shared fields
@@ -24,8 +28,6 @@ module PgSync
  run_tasks(tasks.reject { |task| task.shared_fields.empty? })
  end

- # TODO only query specific tables
- # TODO add sequences, primary keys, etc
  def add_columns
  source_columns = columns(source)
  destination_columns = columns(destination)
@@ -36,6 +38,79 @@ module PgSync
  end
  end

+ def add_primary_keys
+ destination_primary_keys = primary_keys(destination)
+
+ tasks.each do |task|
+ task.to_primary_key = destination_primary_keys[task.table] || []
+ end
+ end
+
+ def add_sequences
+ source_sequences = sequences(source)
+ destination_sequences = sequences(destination)
+
+ tasks.each do |task|
+ shared_columns = Set.new(task.shared_fields)
+
+ task.from_sequences = (source_sequences[task.table] || []).select { |s| shared_columns.include?(s.column) }
+ task.to_sequences = (destination_sequences[task.table] || []).select { |s| shared_columns.include?(s.column) }
+ end
+ end
+
+ def sequences(data_source)
+ query = <<~SQL
+ SELECT
+ nt.nspname as schema,
+ t.relname as table,
+ a.attname as column,
+ n.nspname as sequence_schema,
+ s.relname as sequence
+ FROM
+ pg_class s
+ INNER JOIN
+ pg_depend d ON d.objid = s.oid
+ INNER JOIN
+ pg_class t ON d.objid = s.oid AND d.refobjid = t.oid
+ INNER JOIN
+ pg_attribute a ON (d.refobjid, d.refobjsubid) = (a.attrelid, a.attnum)
+ INNER JOIN
+ pg_namespace n ON n.oid = s.relnamespace
+ INNER JOIN
+ pg_namespace nt ON nt.oid = t.relnamespace
+ WHERE
+ s.relkind = 'S'
+ SQL
+ data_source.execute(query).group_by { |r| Table.new(r["schema"], r["table"]) }.map do |k, v|
+ [k, v.map { |r| Sequence.new(r["sequence_schema"], r["sequence"], column: r["column"]) }]
+ end.to_h
+ end
+
+ def primary_keys(data_source)
+ # https://stackoverflow.com/a/20537829
+ # TODO can simplify with array_position in Postgres 9.5+
+ query = <<~SQL
+ SELECT
+ nspname AS schema,
+ relname AS table,
+ pg_attribute.attname AS column,
+ format_type(pg_attribute.atttypid, pg_attribute.atttypmod),
+ pg_attribute.attnum,
+ pg_index.indkey
+ FROM
+ pg_index, pg_class, pg_attribute, pg_namespace
+ WHERE
+ indrelid = pg_class.oid AND
+ pg_class.relnamespace = pg_namespace.oid AND
+ pg_attribute.attrelid = pg_class.oid AND
+ pg_attribute.attnum = any(pg_index.indkey) AND
+ indisprimary
+ SQL
+ data_source.execute(query).group_by { |r| Table.new(r["schema"], r["table"]) }.map do |k, v|
+ [k, v.sort_by { |r| r["indkey"].split(" ").index(r["attnum"]) }.map { |r| r["column"] }]
+ end.to_h
+ end
+
  def show_notes
  # for tables
  resolver.notes.each do |note|
@@ -66,6 +141,8 @@ module PgSync
  data_type AS type
  FROM
  information_schema.columns
+ WHERE
+ is_generated = 'NEVER'
  ORDER BY 1, 2, 3
  SQL
  data_source.execute(query).group_by { |r| Table.new(r["schema"], r["table"]) }.map do |k, v|
@@ -93,28 +170,33 @@ module PgSync
  def run_tasks(tasks, &block)
  notices = []
  failed_tables = []
-
- spinners = TTY::Spinner::Multi.new(format: :dots, output: output)
- task_spinners = {}
  started_at = {}

+ show_spinners = output.tty? && !opts[:in_batches] && !opts[:debug]
+ if show_spinners
+ spinners = TTY::Spinner::Multi.new(format: :dots, output: output)
+ task_spinners = {}
+ end
+
  start = lambda do |task, i|
  message = ":spinner #{display_item(task)}"
- spinner = spinners.register(message)
- if opts[:in_batches]
- # log instead of spin for non-tty
- log message.sub(":spinner", "⠋")
- else
+
+ if show_spinners
+ spinner = spinners.register(message)
  spinner.auto_spin
+ task_spinners[task] = spinner
+ elsif opts[:in_batches]
+ log message.sub(":spinner", "⠋")
  end
- task_spinners[task] = spinner
+
  started_at[task] = Time.now
  end

  finish = lambda do |task, i, result|
- spinner = task_spinners[task]
  time = (Time.now - started_at[task]).round(1)

+ success = result[:status] == "success"
+
  message =
  if result[:message]
  "(#{result[:message].lines.first.to_s.strip})"
@@ -124,24 +206,31 @@ module PgSync
  end

  notices.concat(result[:notices])

- if result[:status] == "success"
- spinner.success(message)
+ if show_spinners
+ spinner = task_spinners[task]
+ if success
+ spinner.success(message)
+ else
+ spinner.error(message)
+ end
  else
- spinner.error(message)
- failed_tables << task_name(task)
- fail_sync(failed_tables) if opts[:fail_fast]
+ status = success ? "✔" : "✖"
+ log [status, display_item(task), message].join(" ")
  end

- unless spinner.send(:tty?)
- status = result[:status] == "success" ? "✔" : "✖"
- log [status, display_item(task), message].join(" ")
+ unless success
+ failed_tables << task_name(task)
+ fail_sync(failed_tables) if opts[:fail_fast]
  end
  end

  options = {start: start, finish: finish}

  jobs = opts[:jobs]
- if opts[:debug] || opts[:in_batches] || opts[:defer_constraints]
+
+ # disable multiple jobs for defer constraints and disable integrity
+ # so we can use a transaction to ensure a consistent snapshot
+ if opts[:debug] || opts[:in_batches] || opts[:defer_constraints] || opts[:defer_constraints_v2] || opts[:disable_integrity] || opts[:disable_integrity_v2]
  warning "--jobs ignored" if jobs
  jobs = 0
@@ -171,9 +260,25 @@ module PgSync
  fail_sync(failed_tables) if failed_tables.any?
  end

+ # TODO add option to open transaction on source when manually specifying order of tables
  def maybe_defer_constraints
- if opts[:defer_constraints]
+ if opts[:disable_integrity] || opts[:disable_integrity_v2]
+ # create a transaction on the source
+ # to ensure we get a consistent snapshot
+ source.transaction do
+ yield
+ end
+ elsif opts[:defer_constraints] || opts[:defer_constraints_v2]
  destination.transaction do
+ if opts[:defer_constraints_v2]
+ table_constraints = non_deferrable_constraints(destination)
+ table_constraints.each do |table, constraints|
+ constraints.each do |constraint|
+ destination.execute("ALTER TABLE #{quote_ident_full(table)} ALTER CONSTRAINT #{quote_ident(constraint)} DEFERRABLE")
+ end
+ end
+ end
+
  destination.execute("SET CONSTRAINTS ALL DEFERRED")

  # create a transaction on the source
@@ -181,6 +286,20 @@ module PgSync
  source.transaction do
  yield
  end
+
+ # set them back
+ # there are 3 modes: DEFERRABLE INITIALLY DEFERRED, DEFERRABLE INITIALLY IMMEDIATE, and NOT DEFERRABLE
+ # we only update NOT DEFERRABLE
+ # https://www.postgresql.org/docs/current/sql-set-constraints.html
+ if opts[:defer_constraints_v2]
+ destination.execute("SET CONSTRAINTS ALL IMMEDIATE")
+
+ table_constraints.each do |table, constraints|
+ constraints.each do |constraint|
+ destination.execute("ALTER TABLE #{quote_ident_full(table)} ALTER CONSTRAINT #{quote_ident(constraint)} NOT DEFERRABLE")
+ end
+ end
+ end
  end
  else
  yield
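Both new lookups above (`sequences` and `primary_keys`) fetch everything in one query and then bucket the rows per table, replacing the old per-table queries in DataSource. A dependency-free sketch of that grouping pattern, using a plain [schema, table] pair where the real code uses `PgSync::Table`:

```ruby
# rows as they might come back from execute(query)
rows = [
  {"schema" => "public", "table" => "users", "column" => "id"},
  {"schema" => "public", "table" => "posts", "column" => "id"},
  {"schema" => "public", "table" => "users", "column" => "org_id"}
]

# group rows by table, then keep only the per-table values we care about
columns_by_table =
  rows.group_by { |r| [r["schema"], r["table"]] }.map do |table, table_rows|
    [table, table_rows.map { |r| r["column"] }]
  end.to_h

p columns_by_table
# => {["public", "users"]=>["id", "org_id"], ["public", "posts"]=>["id"]}
```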
@@ -3,7 +3,7 @@ module PgSync
  include Utils

  attr_reader :source, :destination, :config, :table, :opts
- attr_accessor :from_columns, :to_columns
+ attr_accessor :from_columns, :to_columns, :from_sequences, :to_sequences, :to_primary_key

  def initialize(source:, destination:, config:, table:, opts:)
  @source = source
@@ -11,6 +11,8 @@ module PgSync
  @config = config
  @table = table
  @opts = opts
+ @from_sequences = []
+ @to_sequences = []
  end

  def quoted_table
@@ -39,14 +41,6 @@ module PgSync
  @shared_fields ||= to_fields & from_fields
  end

- def from_sequences
- @from_sequences ||= opts[:no_sequences] ? [] : source.sequences(table, shared_fields)
- end
-
- def to_sequences
- @to_sequences ||= opts[:no_sequences] ? [] : destination.sequences(table, shared_fields)
- end
-
  def shared_sequences
  @shared_sequences ||= to_sequences & from_sequences
  end
@@ -88,15 +82,10 @@ module PgSync
  sql_clause << " #{opts[:sql]}" if opts[:sql]

  bad_fields = opts[:no_rules] ? [] : config["data_rules"]
- primary_key = destination.primary_key(table)
+ primary_key = to_primary_key
  copy_fields = shared_fields.map { |f| f2 = bad_fields.to_a.find { |bf, _| rule_match?(table, f, bf) }; f2 ? "#{apply_strategy(f2[1], table, f, primary_key)} AS #{quote_ident(f)}" : "#{quoted_table}.#{quote_ident(f)}" }.join(", ")
  fields = shared_fields.map { |f| quote_ident(f) }.join(", ")

- seq_values = {}
- shared_sequences.each do |seq|
- seq_values[seq] = source.last_value(seq)
- end
-
  copy_to_command = "COPY (SELECT #{copy_fields} FROM #{quoted_table}#{sql_clause}) TO STDOUT"
  if opts[:in_batches]
  raise Error, "No primary key" if primary_key.empty?
@@ -151,20 +140,27 @@ module PgSync
  "NOTHING"
  else # overwrite or sql clause
  setter = shared_fields.reject { |f| primary_key.include?(f) }.map { |f| "#{quote_ident(f)} = EXCLUDED.#{quote_ident(f)}" }
- "UPDATE SET #{setter.join(", ")}"
+ if setter.any?
+ "UPDATE SET #{setter.join(", ")}"
+ else
+ "NOTHING"
+ end
  end
- destination.execute("INSERT INTO #{quoted_table} (SELECT * FROM #{quote_ident_full(temp_table)}) ON CONFLICT (#{on_conflict}) DO #{action}")
+ destination.execute("INSERT INTO #{quoted_table} (#{fields}) (SELECT #{fields} FROM #{quote_ident_full(temp_table)}) ON CONFLICT (#{on_conflict}) DO #{action}")
  else
  # use delete instead of truncate for foreign keys
- if opts[:defer_constraints]
+ if opts[:defer_constraints] || opts[:defer_constraints_v2]
  destination.execute("DELETE FROM #{quoted_table}")
  else
  destination.truncate(table)
  end
  copy(copy_to_command, dest_table: table, dest_fields: fields)
  end
- seq_values.each do |seq, value|
- destination.execute("SELECT setval(#{escape(seq)}, #{escape(value)})")
+
+ # update sequences
+ shared_sequences.each do |seq|
+ value = source.last_value(seq)
+ destination.execute("SELECT setval(#{escape(quote_ident_full(seq))}, #{escape(value)})")
  end

  {status: "success"}
@@ -214,6 +210,10 @@ module PgSync

  def copy(source_command, dest_table:, dest_fields:)
  destination_command = "COPY #{quote_ident_full(dest_table)} (#{dest_fields}) FROM STDIN"
+
+ source.log_sql(source_command)
+ destination.log_sql(destination_command)
+
  destination.conn.copy_data(destination_command) do
  source.conn.copy_data(source_command) do
  while (row = source.conn.get_copy_data)
@@ -275,7 +275,7 @@ module PgSync
  end

  def maybe_disable_triggers
- if opts[:disable_integrity] || opts[:disable_user_triggers]
+ if opts[:disable_integrity] || opts[:disable_integrity_v2] || opts[:disable_user_triggers]
  destination.transaction do
  triggers = destination.triggers(table)
  triggers.select! { |t| t["enabled"] == "t" }
@@ -283,7 +283,17 @@ module PgSync
  integrity_triggers = internal_triggers.select { |t| t["integrity"] == "t" }
  restore_triggers = []

- if opts[:disable_integrity]
+ # both --disable-integrity options require superuser privileges
+ # however, only v2 works on Amazon RDS, which added specific support for it
+ # https://aws.amazon.com/about-aws/whats-new/2014/11/10/amazon-rds-postgresql-read-replicas/
+ #
+ # session_replication_role disables more than foreign keys (like triggers and rules)
+ # this is probably fine, but keep the current default for now
+ if opts[:disable_integrity_v2] || (opts[:disable_integrity] && rds?)
+ # SET LOCAL lasts until the end of the transaction
+ # https://www.postgresql.org/docs/current/sql-set.html
+ destination.execute("SET LOCAL session_replication_role = replica")
+ elsif opts[:disable_integrity]
  integrity_triggers.each do |trigger|
  destination.execute("ALTER TABLE #{quoted_table} DISABLE TRIGGER #{quote_ident(trigger["name"])}")
  end
@@ -311,5 +321,9 @@ module PgSync
  yield
  end
  end
+
+ def rds?
+ destination.execute("SELECT name, setting FROM pg_settings WHERE name LIKE 'rds.%'").any?
+ end
  end
  end
@@ -148,7 +148,7 @@ module PgSync
  regex = Regexp.new('\A' + Regexp.escape(value).gsub('\*','[^\.]*') + '\z')
  tables.reject! { |t| regex.match(t.full_name) || regex.match(t.name) }
  else
- tables -= [fully_resolve(to_table(value))]
+ tables -= [fully_resolve(to_table(value), error: false)].compact
  end
  end

@@ -181,9 +181,11 @@ module PgSync
  end

  # for tables without a schema, find the table in the search path
- def fully_resolve(table)
+ def fully_resolve(table, error: true)
  return table if table.schema
- no_schema_tables[table.name] || (raise Error, "Table not found in source: #{table.name}")
+ resolved_table = no_schema_tables[table.name]
+ raise Error, "Table not found in source: #{table.name}" if !resolved_table && error
+ resolved_table
  end

  # parse command line arguments and YAML
@@ -3,7 +3,8 @@ module PgSync
  COLOR_CODES = {
  red: 31,
  green: 32,
- yellow: 33
+ yellow: 33,
+ cyan: 36
  }

  def log(message = nil)
@@ -59,7 +60,7 @@ module PgSync
  end

  def quote_ident_full(ident)
- if ident.is_a?(Table)
+ if ident.is_a?(Table) || ident.is_a?(Sequence)
  [quote_ident(ident.schema), quote_ident(ident.name)].join(".")
  else # temp table names are strings
  quote_ident(ident)
@@ -1,3 +1,3 @@
  module PgSync
- VERSION = "0.6.1"
+ VERSION = "0.6.6"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: pgsync
  version: !ruby/object:Gem::Version
- version: 0.6.1
+ version: 0.6.6
  platform: ruby
  authors:
  - Andrew Kane
- autorequire:
+ autorequire:
  bindir: exe
  cert_chain: []
- date: 2020-06-08 00:00:00.000000000 Z
+ date: 2020-10-30 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: parallel
@@ -44,14 +44,14 @@ dependencies:
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 4.8.1
+ version: 4.8.2
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: 4.8.1
+ version: 4.8.2
  - !ruby/object:Gem::Dependency
  name: tty-spinner
  requirement: !ruby/object:Gem::Requirement
@@ -108,7 +108,7 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
- description:
+ description:
  email: andrew@chartkick.com
  executables:
  - pgsync
@@ -125,6 +125,7 @@ files:
  - lib/pgsync/data_source.rb
  - lib/pgsync/init.rb
  - lib/pgsync/schema_sync.rb
+ - lib/pgsync/sequence.rb
  - lib/pgsync/sync.rb
  - lib/pgsync/table.rb
  - lib/pgsync/table_sync.rb
@@ -136,7 +137,7 @@ homepage: https://github.com/ankane/pgsync
  licenses:
  - MIT
  metadata: {}
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -151,8 +152,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.1.2
- signing_key:
+ rubygems_version: 3.1.4
+ signing_key:
  specification_version: 4
  summary: Sync Postgres data between databases
  test_files: []