pgsync 0.6.0 → 0.6.5


checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: f6c20cf99ebc0961d8699228883f6c2e9c95c9798cfb6f101cb78803df6b1b43
-   data.tar.gz: 7c719442c07e6db1d4704199aa2b95d32e1ba528797b2d21b15be85f2de71d79
+   metadata.gz: e3ef4779b0d0962cfb0a23bb8e1ec4e6a03923e1324a77280499d3c023aa7673
+   data.tar.gz: 59aa2f0868e3a1a920547326a49b86460ab2d8349eb0be68989899ca3bf7ada2
  SHA512:
-   metadata.gz: 1e3d735e6e3002e2fcb74de574f63f534472821aca0154c13d9d7ce6e9fa94426239cef875cd33724fec5a6cf2567b5656a80a9d92f89d9678ee6bee2c2ec70c
-   data.tar.gz: 3ab5d416f4c2364644c015d106631898a3615cc671179f8b1193cda5b1e894512fb39906ec1c27570ca9fbd186c9eb19af3afb88eba093bb21e8bc471c28f18b
+   metadata.gz: 9c6761c05533930479d39fcc318bd1c2601afa06ba50379c94bdaa69e3751fa3229559aa31806778cf17bdf79af0510bdbcb5d0c33dd6e4fdb1e164cda9ea49f
+   data.tar.gz: f8509b918137d97f80044ecac181a9499c128891d33467d91f2f74e6e49248f2ebbacee8bf6233c959c923d583ef8e14cd76b25d19b6b58c0d4166a7cc81af56
@@ -1,3 +1,26 @@
+ ## 0.6.5 (2020-07-10)
+
+ - Improved help
+
+ ## 0.6.4 (2020-06-10)
+
+ - Log SQL with `--debug` option
+ - Improved sequence queries
+
+ ## 0.6.3 (2020-06-09)
+
+ - Added `--defer-constraints-v2` option
+ - Ensure consistent source snapshot with `--disable-integrity`
+
+ ## 0.6.2 (2020-06-09)
+
+ - Added support for `--disable-integrity` on Amazon RDS
+ - Fixed error when excluded table not found in source
+
+ ## 0.6.1 (2020-06-07)
+
+ - Added Django and Laravel integrations
+
  ## 0.6.0 (2020-06-07)
 
  - Added messages for different column types and non-deferrable constraints
data/README.md CHANGED
@@ -35,7 +35,7 @@ This creates `.pgsync.yml` for you to customize. We recommend checking this into
 
  First, make sure your schema is set up in both databases. We recommend using a schema migration tool for this, but pgsync also provides a few [convenience methods](#schema). Once that’s done, you’re ready to sync data.
 
- Sync all tables
+ Sync tables
 
  ```sh
  pgsync
@@ -198,20 +198,20 @@ Rules starting with `unique_` require the table to have a single column primary
 
  Foreign keys can make it difficult to sync data. Three options are:
 
- 1. Manually specify the order of tables
- 2. Use deferrable constraints
- 3. Disable foreign key triggers, which can silently break referential integrity
+ 1. Defer constraints (recommended)
+ 2. Manually specify the order of tables
+ 3. Disable foreign key triggers, which can silently break referential integrity (not recommended)
 
- When manually specifying the order, use `--jobs 1` so tables are synced one-at-a-time.
+ To defer constraints, use:
 
  ```sh
- pgsync table1,table2,table3 --jobs 1
+ pgsync --defer-constraints-v2
  ```
 
- If your tables have [deferrable constraints](https://begriffs.com/posts/2017-08-27-deferrable-sql-constraints.html), use:
+ To manually specify the order of tables, use `--jobs 1` so tables are synced one-at-a-time.
 
  ```sh
- pgsync --defer-constraints
+ pgsync table1,table2,table3 --jobs 1
  ```
 
  To disable foreign key triggers and potentially break referential integrity, use:
@@ -220,6 +220,8 @@ To disable foreign key triggers and potentially break referential integrity, use
  pgsync --disable-integrity
  ```
 
+ This requires superuser privileges on the `to` database. If syncing to (not from) Amazon RDS, use the `rds_superuser` role. If syncing to (not from) Heroku, there doesn’t appear to be a way to disable integrity.
+
  ## Triggers
 
  Disable user triggers with:
@@ -262,6 +264,57 @@ This creates `.pgsync-db2.yml` for you to edit. Specify a database in commands w
  pgsync --db db2
  ```
 
+ ## Integrations
+
+ - [Django](#django)
+ - [Heroku](#heroku)
+ - [Laravel](#laravel)
+ - [Rails](#rails)
+
+ ### Django
+
+ If you run `pgsync --init` in a Django project, migrations will be excluded in `.pgsync.yml`.
+
+ ```yml
+ exclude:
+   - django_migrations
+ ```
+
+ ### Heroku
+
+ If you run `pgsync --init` in a Heroku project, the `from` database will be set in `.pgsync.yml`.
+
+ ```yml
+ from: $(heroku config:get DATABASE_URL)?sslmode=require
+ ```
+
+ ### Laravel
+
+ If you run `pgsync --init` in a Laravel project, migrations will be excluded in `.pgsync.yml`.
+
+ ```yml
+ exclude:
+   - migrations
+ ```
+
+ ### Rails
+
+ If you run `pgsync --init` in a Rails project, Active Record metadata and schema migrations will be excluded in `.pgsync.yml`.
+
+ ```yml
+ exclude:
+   - ar_internal_metadata
+   - schema_migrations
+ ```
+
+ ## Debugging
+
+ To view the SQL that’s run, use:
+
+ ```sh
+ pgsync --debug
+ ```
+
  ## Other Commands
 
  Help
@@ -337,6 +390,10 @@ Also check out:
 
  Inspired by [heroku-pg-transfer](https://github.com/ddollar/heroku-pg-transfer).
 
+ ## History
+
+ View the [changelog](https://github.com/ankane/pgsync/blob/master/CHANGELOG.md)
+
  ## Contributing
 
  Everyone is encouraged to help improve this project. Here are a few ways you can help:
@@ -18,6 +18,7 @@ require "pgsync/client"
  require "pgsync/data_source"
  require "pgsync/init"
  require "pgsync/schema_sync"
+ require "pgsync/sequence"
  require "pgsync/sync"
  require "pgsync/table"
  require "pgsync/table_sync"
@@ -39,39 +39,74 @@ module PgSync
      def slop_options
        o = Slop::Options.new
        o.banner = %{Usage:
-     pgsync [options]
+     pgsync [tables,groups] [sql] [options]}
 
- Options:}
-       o.string "-d", "--db", "database"
-       o.string "-t", "--tables", "tables to sync"
-       o.string "-g", "--groups", "groups to sync"
-       o.integer "-j", "--jobs", "number of tables to sync at a time"
+       # not shown
+       o.string "-t", "--tables", "tables to sync", help: false
+       o.string "-g", "--groups", "groups to sync", help: false
+
+       o.separator ""
+       o.separator "Table options:"
+       o.string "--exclude", "tables to exclude"
        o.string "--schemas", "schemas to sync"
-       o.string "--from", "source"
-       o.string "--to", "destination"
-       o.string "--exclude", "exclude tables"
-       o.string "--config", "config file"
-       o.boolean "--to-safe", "accept danger", default: false
-       o.boolean "--debug", "debug", default: false
-       o.boolean "--list", "list", default: false
-       o.boolean "--overwrite", "overwrite existing rows", default: false, help: false
+       o.boolean "--all-schemas", "sync all schemas", default: false
+
+       o.separator ""
+       o.separator "Row options:"
+       o.boolean "--overwrite", "overwrite existing rows", default: false
        o.boolean "--preserve", "preserve existing rows", default: false
        o.boolean "--truncate", "truncate existing rows", default: false
-       o.boolean "--schema-first", "schema first", default: false
-       o.boolean "--schema-only", "schema only", default: false
-       o.boolean "--all-schemas", "all schemas", default: false
-       o.boolean "--no-rules", "do not apply data rules", default: false
-       o.boolean "--no-sequences", "do not sync sequences", default: false
-       o.boolean "--init", "init", default: false
-       o.boolean "--in-batches", "in batches", default: false, help: false
-       o.integer "--batch-size", "batch size", default: 10000, help: false
-       o.float "--sleep", "sleep", default: 0, help: false
-       o.boolean "--fail-fast", "stop on the first failed table", default: false
-       o.boolean "--defer-constraints", "defer constraints", default: false
-       o.boolean "--disable-user-triggers", "disable non-system triggers", default: false
+
+       o.separator ""
+       o.separator "Foreign key options:"
+       o.boolean "--defer-constraints-v2", "defer constraints", default: false
        o.boolean "--disable-integrity", "disable foreign key triggers", default: false
-       o.boolean "-v", "--version", "print the version"
-       o.boolean "-h", "--help", "prints help"
+       o.integer "-j", "--jobs", "number of tables to sync at a time"
+
+       # replaced by v2
+       o.boolean "--defer-constraints", "defer constraints", default: false, help: false
+       # private, for testing
+       o.boolean "--disable-integrity-v2", "disable foreign key triggers", default: false, help: false
+
+       o.separator ""
+       o.separator "Schema options:"
+       o.boolean "--schema-first", "sync schema first", default: false
+       o.boolean "--schema-only", "sync schema only", default: false
+
+       o.separator ""
+       o.separator "Config options:"
+       # technically, defaults to searching path for .pgsync.yml, but this is simpler
+       o.string "--config", "config file (defaults to .pgsync.yml)"
+       o.string "-d", "--db", "database-specific config file"
+
+       o.separator ""
+       o.separator "Connection options:"
+       o.string "--from", "source database URL"
+       o.string "--to", "destination database URL"
+       o.boolean "--to-safe", "confirms destination is safe (when not localhost)", default: false
+
+       o.separator ""
+       o.separator "Other options:"
+       o.boolean "--debug", "show SQL statements", default: false
+       o.boolean "--disable-user-triggers", "disable non-system triggers", default: false
+       o.boolean "--fail-fast", "stop on the first failed table", default: false
+       o.boolean "--no-rules", "don't apply data rules", default: false
+       o.boolean "--no-sequences", "don't sync sequences", default: false
+
+       # not shown in help
+       # o.separator ""
+       # o.separator "Append-only table options:"
+       o.boolean "--in-batches", "sync in batches", default: false, help: false
+       o.integer "--batch-size", "batch size", default: 10000, help: false
+       o.float "--sleep", "time to sleep between batches", default: 0, help: false
+
+       o.separator ""
+       o.separator "Other commands:"
+       o.boolean "--init", "create config file", default: false
+       o.boolean "--list", "list tables", default: false
+       o.boolean "-h", "--help", "print help"
+       o.boolean "-v", "--version", "print version"
+
        o
      end
    end
@@ -4,8 +4,10 @@ module PgSync
 
      attr_reader :url
 
-     def initialize(url)
+     def initialize(url, name:, debug:)
        @url = url
+       @name = name
+       @debug = debug
      end
 
      def exists?
@@ -50,10 +52,6 @@ module PgSync
        table_set.include?(table)
      end
 
-     def sequences(table, columns)
-       execute("SELECT #{columns.map { |f| "pg_get_serial_sequence(#{escape("#{quote_ident_full(table)}")}, #{escape(f)}) AS #{quote_ident(f)}" }.join(", ")}").first.values.compact
-     end
-
      def max_id(table, primary_key, sql_clause = nil)
        execute("SELECT MAX(#{quote_ident(primary_key)}) FROM #{quote_ident_full(table)}#{sql_clause}").first["max"].to_i
      end
@@ -62,39 +60,14 @@ module PgSync
        execute("SELECT MIN(#{quote_ident(primary_key)}) FROM #{quote_ident_full(table)}#{sql_clause}").first["min"].to_i
      end
 
-     # this value comes from pg_get_serial_sequence which is already quoted
      def last_value(seq)
-       execute("SELECT last_value FROM #{seq}").first["last_value"]
+       execute("SELECT last_value FROM #{quote_ident_full(seq)}").first["last_value"]
      end
 
      def truncate(table)
        execute("TRUNCATE #{quote_ident_full(table)} CASCADE")
      end
 
-     # https://stackoverflow.com/a/20537829
-     # TODO can simplify with array_position in Postgres 9.5+
-     def primary_key(table)
-       query = <<~SQL
-         SELECT
-           pg_attribute.attname,
-           format_type(pg_attribute.atttypid, pg_attribute.atttypmod),
-           pg_attribute.attnum,
-           pg_index.indkey
-         FROM
-           pg_index, pg_class, pg_attribute, pg_namespace
-         WHERE
-           nspname = $1 AND
-           relname = $2 AND
-           indrelid = pg_class.oid AND
-           pg_class.relnamespace = pg_namespace.oid AND
-           pg_attribute.attrelid = pg_class.oid AND
-           pg_attribute.attnum = any(pg_index.indkey) AND
-           indisprimary
-       SQL
-       rows = execute(query, [table.schema, table.name])
-       rows.sort_by { |r| r["indkey"].split(" ").index(r["attnum"]) }.map { |r| r["attname"] }
-     end
-
      def triggers(table)
        query = <<~SQL
          SELECT
@@ -148,20 +121,34 @@ module PgSync
      end
 
      def execute(query, params = [])
+       log_sql query, params
        conn.exec_params(query, params).to_a
      end
 
      def transaction
        if conn.transaction_status == 0
          # not currently in transaction
-         conn.transaction do
-           yield
-         end
+         log_sql "BEGIN"
+         result =
+           conn.transaction do
+             yield
+           end
+         log_sql "COMMIT"
+         result
        else
          yield
        end
      end
 
+     # TODO log time for each statement
+     def log_sql(query, params = {})
+       if @debug
+         message = "#{colorize("[#{@name}]", :cyan)} #{query.gsub(/\s+/, " ").strip}"
+         message = "#{message} #{params.inspect}" if params.any?
+         log message
+       end
+     end
+
      private
 
      def concurrent_id
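The message format that `log_sql` builds can be tried in isolation. The following is a minimal sketch, not pgsync's code: `format_sql` is a hypothetical standalone helper, and it leaves out the `colorize` call and the `@debug` check from the diff.

```ruby
# Hypothetical standalone version of the formatting done in log_sql
# (no colorize, no @debug guard; both are assumptions of this sketch).
def format_sql(name, query, params = [])
  # collapse newlines and runs of whitespace so each statement fits on one line
  message = "[#{name}] #{query.gsub(/\s+/, " ").strip}"
  # append bind parameters only when there are any
  message = "#{message} #{params.inspect}" if params.any?
  message
end

puts format_sql("from", "SELECT\n  MAX(id)\nFROM users")
puts format_sql("to", "SELECT setval($1, $2)", ["users_id_seq", 42])
```

The `name` prefix ("from" or "to") is what lets a reader of the debug output tell which database each statement ran against.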
@@ -30,8 +30,19 @@ module PgSync
          if rails?
            <<~EOS
              exclude:
-               - schema_migrations
                - ar_internal_metadata
+               - schema_migrations
+           EOS
+         elsif django?
+           # TODO exclude other tables?
+           <<~EOS
+             exclude:
+               - django_migrations
+           EOS
+         elsif laravel?
+           <<~EOS
+             exclude:
+               - migrations
            EOS
          else
            <<~EOS
@@ -50,12 +61,30 @@ module PgSync
        end
      end
 
+     def django?
+       file_exists?("manage.py", /django/i)
+     end
+
      def heroku?
        `git remote -v 2>&1`.include?("git.heroku.com") rescue false
      end
 
+     def laravel?
+       file_exists?("artisan")
+     end
+
      def rails?
-       File.exist?("bin/rails")
+       file_exists?("bin/rails")
+     end
+
+     def file_exists?(path, contents = nil)
+       if contents
+         File.read(path).match(contents)
+       else
+         File.exist?(path)
+       end
+     rescue
+       false
      end
    end
  end
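The `file_exists?` helper added above treats any read error as "not detected". A minimal sketch of that behavior, run against a temporary directory; the file names mirror the diff, but the directory layout and file contents are made up for illustration:

```ruby
require "tmpdir"

# Same shape as the helper in the diff: an optional pattern check,
# with any StandardError (e.g. a missing file) treated as "no".
def file_exists?(path, contents = nil)
  if contents
    File.read(path).match(contents)
  else
    File.exist?(path)
  end
rescue
  false
end

results = Dir.mktmpdir do |dir|
  manage = File.join(dir, "manage.py")
  File.write(manage, "import django\n") # illustrative contents only

  [
    !!file_exists?(manage, /django/i),                  # file present and pattern matches
    file_exists?(File.join(dir, "artisan")),            # file absent, no pattern
    file_exists?(File.join(dir, "nope.py"), /django/i)  # File.read raises, rescued to false
  ]
end

p results
```

Note that with a pattern the helper returns a `MatchData` or `nil` rather than a strict boolean, which is fine for the `if`/`elsif` checks in the diff.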
@@ -0,0 +1,29 @@
+ # minimal class to keep schema and sequence name separate
+ module PgSync
+   class Sequence
+     attr_reader :schema, :name, :column
+
+     def initialize(schema, name, column:)
+       @schema = schema
+       @name = name
+       @column = column
+     end
+
+     def full_name
+       "#{schema}.#{name}"
+     end
+
+     def eql?(other)
+       other.schema == schema && other.name == name
+     end
+
+     # override hash when overriding eql?
+     def hash
+       [schema, name].hash
+     end
+
+     def to_s
+       full_name
+     end
+   end
+ end
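Overriding `eql?` and `hash` together is what lets array set operations such as `to_sequences & from_sequences` match sequences from two different databases. A minimal sketch with a stand-in class (`Seq` and the sample sequence names are hypothetical):

```ruby
# Hypothetical stand-in for the Sequence class in the diff
class Seq
  attr_reader :schema, :name, :column

  def initialize(schema, name, column:)
    @schema = schema
    @name = name
    @column = column
  end

  # two objects are the same sequence when schema and name agree
  def eql?(other)
    other.schema == schema && other.name == name
  end

  # must be consistent with eql? so hash-based lookups agree
  def hash
    [schema, name].hash
  end
end

from = [
  Seq.new("public", "users_id_seq", column: "id"),
  Seq.new("public", "posts_id_seq", column: "id")
]
to = [Seq.new("public", "users_id_seq", column: "id")]

# Array#& compares elements with hash and eql?, not object identity
shared = to & from
puts shared.map(&:name).inspect
```

Without the overrides, the two `users_id_seq` objects would be distinct instances and the intersection would be empty.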
@@ -34,13 +34,13 @@ module PgSync
          raise Error, "Danger! Add `to_safe: true` to `.pgsync.yml` if the destination is not localhost or 127.0.0.1"
        end
 
+       print_description("From", source)
+       print_description("To", destination)
+
        if (opts[:preserve] || opts[:overwrite]) && destination.server_version_num < 90500
          raise Error, "Postgres 9.5+ is required for --preserve and --overwrite"
        end
 
-       print_description("From", source)
-       print_description("To", destination)
-
        resolver = TaskResolver.new(args: args, opts: opts, source: source, destination: destination, config: config, first_schema: first_schema)
        tasks =
          resolver.tasks.map do |task|
@@ -126,19 +126,20 @@ module PgSync
      end
 
      def source
-       @source ||= data_source(@options[:from])
+       @source ||= data_source(@options[:from], "from")
      end
 
      def destination
-       @destination ||= data_source(@options[:to])
+       @destination ||= data_source(@options[:to], "to")
      end
 
-     def data_source(url)
-       ds = DataSource.new(url)
+     def data_source(url, name)
+       ds = DataSource.new(url, name: name, debug: @options[:debug])
        ObjectSpace.define_finalizer(self, self.class.finalize(ds))
        ds
      end
 
+     # ideally aliases would work, but haven't found a nice way to do this
      def resolve_source(source)
        if source
          source = source.dup
@@ -17,6 +17,10 @@ module PgSync
 
        add_columns
 
+       add_primary_keys
+
+       add_sequences unless opts[:no_sequences]
+
        show_notes
 
        # don't sync tables with no shared fields
@@ -24,8 +28,6 @@ module PgSync
        run_tasks(tasks.reject { |task| task.shared_fields.empty? })
      end
 
-     # TODO only query specific tables
-     # TODO add sequences, primary keys, etc
      def add_columns
        source_columns = columns(source)
        destination_columns = columns(destination)
@@ -36,6 +38,79 @@ module PgSync
        end
      end
 
+     def add_primary_keys
+       destination_primary_keys = primary_keys(destination)
+
+       tasks.each do |task|
+         task.to_primary_key = destination_primary_keys[task.table] || []
+       end
+     end
+
+     def add_sequences
+       source_sequences = sequences(source)
+       destination_sequences = sequences(destination)
+
+       tasks.each do |task|
+         shared_columns = Set.new(task.shared_fields)
+
+         task.from_sequences = (source_sequences[task.table] || []).select { |s| shared_columns.include?(s.column) }
+         task.to_sequences = (destination_sequences[task.table] || []).select { |s| shared_columns.include?(s.column) }
+       end
+     end
+
+     def sequences(data_source)
+       query = <<~SQL
+         SELECT
+           nt.nspname as schema,
+           t.relname as table,
+           a.attname as column,
+           n.nspname as sequence_schema,
+           s.relname as sequence
+         FROM
+           pg_class s
+         INNER JOIN
+           pg_depend d ON d.objid = s.oid
+         INNER JOIN
+           pg_class t ON d.objid = s.oid AND d.refobjid = t.oid
+         INNER JOIN
+           pg_attribute a ON (d.refobjid, d.refobjsubid) = (a.attrelid, a.attnum)
+         INNER JOIN
+           pg_namespace n ON n.oid = s.relnamespace
+         INNER JOIN
+           pg_namespace nt ON nt.oid = t.relnamespace
+         WHERE
+           s.relkind = 'S'
+       SQL
+       data_source.execute(query).group_by { |r| Table.new(r["schema"], r["table"]) }.map do |k, v|
+         [k, v.map { |r| Sequence.new(r["sequence_schema"], r["sequence"], column: r["column"]) }]
+       end.to_h
+     end
+
+     def primary_keys(data_source)
+       # https://stackoverflow.com/a/20537829
+       # TODO can simplify with array_position in Postgres 9.5+
+       query = <<~SQL
+         SELECT
+           nspname AS schema,
+           relname AS table,
+           pg_attribute.attname AS column,
+           format_type(pg_attribute.atttypid, pg_attribute.atttypmod),
+           pg_attribute.attnum,
+           pg_index.indkey
+         FROM
+           pg_index, pg_class, pg_attribute, pg_namespace
+         WHERE
+           indrelid = pg_class.oid AND
+           pg_class.relnamespace = pg_namespace.oid AND
+           pg_attribute.attrelid = pg_class.oid AND
+           pg_attribute.attnum = any(pg_index.indkey) AND
+           indisprimary
+       SQL
+       data_source.execute(query).group_by { |r| Table.new(r["schema"], r["table"]) }.map do |k, v|
+         [k, v.sort_by { |r| r["indkey"].split(" ").index(r["attnum"]) }.map { |r| r["column"] }]
+       end.to_h
+     end
+
      def show_notes
        # for tables
        resolver.notes.each do |note|
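The new `primary_keys` method fetches every primary key in one query, then groups rows by table and orders columns by their position in `indkey`. A minimal sketch of that post-processing on hand-written rows; the row values are made up, and a plain `[schema, table]` array stands in for the `Table` object used in the diff:

```ruby
# Rows shaped like the result of the catalog query; values are illustrative only.
# Values arrive as strings, and indkey is a space-separated list of attnums.
rows = [
  {"schema" => "public", "table" => "orders", "column" => "order_id", "attnum" => "2", "indkey" => "3 2"},
  {"schema" => "public", "table" => "orders", "column" => "line_no",  "attnum" => "3", "indkey" => "3 2"},
  {"schema" => "public", "table" => "users",  "column" => "id",       "attnum" => "1", "indkey" => "1"}
]

# Group rows per table, then order each table's columns by where their
# attnum appears in indkey, which is the declared primary key column order.
primary_keys = rows.group_by { |r| [r["schema"], r["table"]] }.map do |table, table_rows|
  [table, table_rows.sort_by { |r| r["indkey"].split(" ").index(r["attnum"]) }.map { |r| r["column"] }]
end.to_h

p primary_keys
```

Sorting by `indkey` position matters for composite keys, where the key order can differ from the order of the columns in the table.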
@@ -93,28 +168,33 @@ module PgSync
      def run_tasks(tasks, &block)
        notices = []
        failed_tables = []
-
-       spinners = TTY::Spinner::Multi.new(format: :dots, output: output)
-       task_spinners = {}
        started_at = {}
 
+       show_spinners = output.tty? && !opts[:in_batches] && !opts[:debug]
+       if show_spinners
+         spinners = TTY::Spinner::Multi.new(format: :dots, output: output)
+         task_spinners = {}
+       end
+
        start = lambda do |task, i|
          message = ":spinner #{display_item(task)}"
-         spinner = spinners.register(message)
-         if opts[:in_batches]
-           # log instead of spin for non-tty
-           log message.sub(":spinner", "⠋")
-         else
+
+         if show_spinners
+           spinner = spinners.register(message)
            spinner.auto_spin
+           task_spinners[task] = spinner
+         elsif opts[:in_batches]
+           log message.sub(":spinner", "⠋")
          end
-         task_spinners[task] = spinner
+
          started_at[task] = Time.now
        end
 
        finish = lambda do |task, i, result|
-         spinner = task_spinners[task]
          time = (Time.now - started_at[task]).round(1)
 
+         success = result[:status] == "success"
+
          message =
            if result[:message]
              "(#{result[:message].lines.first.to_s.strip})"
@@ -124,24 +204,31 @@ module PgSync
 
          notices.concat(result[:notices])
 
-         if result[:status] == "success"
-           spinner.success(message)
+         if show_spinners
+           spinner = task_spinners[task]
+           if success
+             spinner.success(message)
+           else
+             spinner.error(message)
+           end
          else
-           spinner.error(message)
-           failed_tables << task_name(task)
-           fail_sync(failed_tables) if opts[:fail_fast]
+           status = success ? "✔" : "✖"
+           log [status, display_item(task), message].join(" ")
          end
 
-         unless spinner.send(:tty?)
-           status = result[:status] == "success" ? "✔" : "✖"
-           log [status, display_item(task), message].join(" ")
+         unless success
+           failed_tables << task_name(task)
+           fail_sync(failed_tables) if opts[:fail_fast]
          end
        end
 
        options = {start: start, finish: finish}
 
        jobs = opts[:jobs]
-       if opts[:debug] || opts[:in_batches] || opts[:defer_constraints]
+
+       # disable multiple jobs for defer constraints and disable integrity
+       # so we can use a transaction to ensure a consistent snapshot
+       if opts[:debug] || opts[:in_batches] || opts[:defer_constraints] || opts[:defer_constraints_v2] || opts[:disable_integrity] || opts[:disable_integrity_v2]
          warning "--jobs ignored" if jobs
          jobs = 0
        end
@@ -171,9 +258,25 @@ module PgSync
        fail_sync(failed_tables) if failed_tables.any?
      end
 
+     # TODO add option to open transaction on source when manually specifying order of tables
      def maybe_defer_constraints
-       if opts[:defer_constraints]
+       if opts[:disable_integrity] || opts[:disable_integrity_v2]
+         # create a transaction on the source
+         # to ensure we get a consistent snapshot
+         source.transaction do
+           yield
+         end
+       elsif opts[:defer_constraints] || opts[:defer_constraints_v2]
          destination.transaction do
+           if opts[:defer_constraints_v2]
+             table_constraints = non_deferrable_constraints(destination)
+             table_constraints.each do |table, constraints|
+               constraints.each do |constraint|
+                 destination.execute("ALTER TABLE #{quote_ident_full(table)} ALTER CONSTRAINT #{quote_ident(constraint)} DEFERRABLE")
+               end
+             end
+           end
+
            destination.execute("SET CONSTRAINTS ALL DEFERRED")
 
            # create a transaction on the source
@@ -181,6 +284,20 @@ module PgSync
            source.transaction do
              yield
            end
+
+           # set them back
+           # there are 3 modes: DEFERRABLE INITIALLY DEFERRED, DEFERRABLE INITIALLY IMMEDIATE, and NOT DEFERRABLE
+           # we only update NOT DEFERRABLE
+           # https://www.postgresql.org/docs/current/sql-set-constraints.html
+           if opts[:defer_constraints_v2]
+             destination.execute("SET CONSTRAINTS ALL IMMEDIATE")
+
+             table_constraints.each do |table, constraints|
+               constraints.each do |constraint|
+                 destination.execute("ALTER TABLE #{quote_ident_full(table)} ALTER CONSTRAINT #{quote_ident(constraint)} NOT DEFERRABLE")
+               end
+             end
+           end
          end
        else
          yield
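The two hunks above make `--defer-constraints-v2` temporarily mark non-deferrable constraints as deferrable, defer them, copy the data, and then restore them, all inside one destination transaction. A minimal sketch of the statement order, assuming a hypothetical `deferred_sync_statements` helper that only builds SQL strings (nothing is executed, and the simple `quote_ident` here is an illustration, not pgsync's quoting code):

```ruby
# Hypothetical identifier quoting for this sketch only
def quote_ident(value)
  '"' + value.gsub('"', '""') + '"'
end

# Builds the statements in the order the diff issues them on the destination.
# table_constraints maps a table name to its NOT DEFERRABLE constraint names.
def deferred_sync_statements(table_constraints)
  alter = lambda do |mode|
    table_constraints.flat_map do |table, constraints|
      constraints.map do |constraint|
        "ALTER TABLE #{quote_ident(table)} ALTER CONSTRAINT #{quote_ident(constraint)} #{mode}"
      end
    end
  end

  ["BEGIN"] +
    alter.call("DEFERRABLE") +
    ["SET CONSTRAINTS ALL DEFERRED", "-- copy rows here", "SET CONSTRAINTS ALL IMMEDIATE"] +
    alter.call("NOT DEFERRABLE") +
    ["COMMIT"]
end

statements = deferred_sync_statements({"posts" => ["posts_user_id_fkey"]})
puts statements
```

Running `SET CONSTRAINTS ALL IMMEDIATE` before switching the constraints back means any violation surfaces inside the transaction, before the commit.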
@@ -3,7 +3,7 @@ module PgSync
      include Utils
 
      attr_reader :source, :destination, :config, :table, :opts
-     attr_accessor :from_columns, :to_columns
+     attr_accessor :from_columns, :to_columns, :from_sequences, :to_sequences, :to_primary_key
 
      def initialize(source:, destination:, config:, table:, opts:)
        @source = source
@@ -11,6 +11,8 @@ module PgSync
        @config = config
        @table = table
        @opts = opts
+       @from_sequences = []
+       @to_sequences = []
      end
 
      def quoted_table
@@ -39,14 +41,6 @@ module PgSync
        @shared_fields ||= to_fields & from_fields
      end
 
-     def from_sequences
-       @from_sequences ||= opts[:no_sequences] ? [] : source.sequences(table, shared_fields)
-     end
-
-     def to_sequences
-       @to_sequences ||= opts[:no_sequences] ? [] : destination.sequences(table, shared_fields)
-     end
-
      def shared_sequences
        @shared_sequences ||= to_sequences & from_sequences
      end
@@ -88,15 +82,10 @@ module PgSync
        sql_clause << " #{opts[:sql]}" if opts[:sql]
 
        bad_fields = opts[:no_rules] ? [] : config["data_rules"]
-       primary_key = destination.primary_key(table)
+       primary_key = to_primary_key
        copy_fields = shared_fields.map { |f| f2 = bad_fields.to_a.find { |bf, _| rule_match?(table, f, bf) }; f2 ? "#{apply_strategy(f2[1], table, f, primary_key)} AS #{quote_ident(f)}" : "#{quoted_table}.#{quote_ident(f)}" }.join(", ")
        fields = shared_fields.map { |f| quote_ident(f) }.join(", ")
 
-       seq_values = {}
-       shared_sequences.each do |seq|
-         seq_values[seq] = source.last_value(seq)
-       end
-
        copy_to_command = "COPY (SELECT #{copy_fields} FROM #{quoted_table}#{sql_clause}) TO STDOUT"
        if opts[:in_batches]
          raise Error, "No primary key" if primary_key.empty?
@@ -156,15 +145,18 @@ module PgSync
          destination.execute("INSERT INTO #{quoted_table} (SELECT * FROM #{quote_ident_full(temp_table)}) ON CONFLICT (#{on_conflict}) DO #{action}")
        else
          # use delete instead of truncate for foreign keys
-         if opts[:defer_constraints]
+         if opts[:defer_constraints] || opts[:defer_constraints_v2]
            destination.execute("DELETE FROM #{quoted_table}")
          else
            destination.truncate(table)
          end
          copy(copy_to_command, dest_table: table, dest_fields: fields)
        end
-       seq_values.each do |seq, value|
-         destination.execute("SELECT setval(#{escape(seq)}, #{escape(value)})")
+
+       # update sequences
+       shared_sequences.each do |seq|
+         value = source.last_value(seq)
+         destination.execute("SELECT setval(#{escape(quote_ident_full(seq))}, #{escape(value)})")
        end
 
        {status: "success"}
@@ -214,6 +206,10 @@ module PgSync
 
      def copy(source_command, dest_table:, dest_fields:)
        destination_command = "COPY #{quote_ident_full(dest_table)} (#{dest_fields}) FROM STDIN"
+
+       source.log_sql(source_command)
+       destination.log_sql(destination_command)
+
        destination.conn.copy_data(destination_command) do
          source.conn.copy_data(source_command) do
            while (row = source.conn.get_copy_data)
@@ -275,7 +271,7 @@ module PgSync
      end
 
      def maybe_disable_triggers
-       if opts[:disable_integrity] || opts[:disable_user_triggers]
+       if opts[:disable_integrity] || opts[:disable_integrity_v2] || opts[:disable_user_triggers]
          destination.transaction do
            triggers = destination.triggers(table)
            triggers.select! { |t| t["enabled"] == "t" }
@@ -283,7 +279,17 @@ module PgSync
            integrity_triggers = internal_triggers.select { |t| t["integrity"] == "t" }
            restore_triggers = []
 
-           if opts[:disable_integrity]
+           # both --disable-integrity options require superuser privileges
+           # however, only v2 works on Amazon RDS, which added specific support for it
+           # https://aws.amazon.com/about-aws/whats-new/2014/11/10/amazon-rds-postgresql-read-replicas/
+           #
+           # session_replication_role disables more than foreign keys (like triggers and rules)
+           # this is probably fine, but keep the current default for now
+           if opts[:disable_integrity_v2] || (opts[:disable_integrity] && rds?)
+             # SET LOCAL lasts until the end of the transaction
+             # https://www.postgresql.org/docs/current/sql-set.html
+             destination.execute("SET LOCAL session_replication_role = replica")
+           elsif opts[:disable_integrity]
              integrity_triggers.each do |trigger|
                destination.execute("ALTER TABLE #{quoted_table} DISABLE TRIGGER #{quote_ident(trigger["name"])}")
              end
@@ -311,5 +317,9 @@ module PgSync
          yield
        end
      end
+
+     def rds?
+       destination.execute("SELECT name, setting FROM pg_settings WHERE name LIKE 'rds.%'").any?
+     end
    end
  end
@@ -148,7 +148,7 @@ module PgSync
            regex = Regexp.new('\A' + Regexp.escape(value).gsub('\*','[^\.]*') + '\z')
            tables.reject! { |t| regex.match(t.full_name) || regex.match(t.name) }
          else
-           tables -= [fully_resolve(to_table(value))]
+           tables -= [fully_resolve(to_table(value), error: false)].compact
          end
        end
 
@@ -181,9 +181,11 @@ module PgSync
      end
 
      # for tables without a schema, find the table in the search path
-     def fully_resolve(table)
+     def fully_resolve(table, error: true)
        return table if table.schema
-       no_schema_tables[table.name] || (raise Error, "Table not found in source: #{table.name}")
+       resolved_table = no_schema_tables[table.name]
+       raise Error, "Table not found in source: #{table.name}" if !resolved_table && error
+       resolved_table
      end
 
      # parse command line arguments and YAML
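The `error: false` flag combined with `.compact` is what fixes the "excluded table not found in source" error from the changelog: an unknown excluded table resolves to `nil` and is dropped instead of raising. A minimal sketch with plain strings in place of pgsync's `Table` objects; `ResolveError` and `KNOWN_TABLES` are hypothetical names for this illustration:

```ruby
# Hypothetical error class and lookup table standing in for pgsync internals
class ResolveError < StandardError; end

KNOWN_TABLES = {"users" => "public.users", "posts" => "public.posts"}.freeze

# Returns the resolved name, raises when missing and error is true,
# or returns nil when missing and error is false.
def fully_resolve(name, error: true)
  resolved = KNOWN_TABLES[name]
  raise ResolveError, "Table not found in source: #{name}" if !resolved && error
  resolved
end

tables = ["public.users", "public.posts"]

# excluding a table that does not exist is now a no-op: [nil].compact is []
tables -= [fully_resolve("missing", error: false)].compact
# excluding a table that exists removes it as before
tables -= [fully_resolve("users", error: false)].compact

p tables
```

Callers that do not pass `error: false` keep the old behavior and still get an exception for unknown tables.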
@@ -3,7 +3,8 @@ module PgSync
      COLOR_CODES = {
        red: 31,
        green: 32,
-       yellow: 33
+       yellow: 33,
+       cyan: 36
      }
 
      def log(message = nil)
@@ -59,7 +60,7 @@ module PgSync
      end
 
      def quote_ident_full(ident)
-       if ident.is_a?(Table)
+       if ident.is_a?(Table) || ident.is_a?(Sequence)
          [quote_ident(ident.schema), quote_ident(ident.name)].join(".")
        else # temp table names are strings
          quote_ident(ident)
@@ -1,3 +1,3 @@
  module PgSync
-   VERSION = "0.6.0"
+   VERSION = "0.6.5"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: pgsync
  version: !ruby/object:Gem::Version
-   version: 0.6.0
+   version: 0.6.5
  platform: ruby
  authors:
  - Andrew Kane
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2020-06-07 00:00:00.000000000 Z
+ date: 2020-07-10 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: parallel
@@ -44,14 +44,14 @@ dependencies:
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: 4.8.1
+         version: 4.8.2
    type: :runtime
    prerelease: false
    version_requirements: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
-         version: 4.8.1
+         version: 4.8.2
  - !ruby/object:Gem::Dependency
    name: tty-spinner
    requirement: !ruby/object:Gem::Requirement
@@ -125,6 +125,7 @@ files:
  - lib/pgsync/data_source.rb
  - lib/pgsync/init.rb
  - lib/pgsync/schema_sync.rb
+ - lib/pgsync/sequence.rb
  - lib/pgsync/sync.rb
  - lib/pgsync/table.rb
  - lib/pgsync/table_sync.rb
  - lib/pgsync/table_sync.rb