pgsync 0.6.0 → 0.6.5
- checksums.yaml +4 -4
- data/CHANGELOG.md +23 -0
- data/README.md +65 -8
- data/lib/pgsync.rb +1 -0
- data/lib/pgsync/client.rb +63 -28
- data/lib/pgsync/data_source.rb +21 -34
- data/lib/pgsync/init.rb +31 -2
- data/lib/pgsync/sequence.rb +29 -0
- data/lib/pgsync/sync.rb +8 -7
- data/lib/pgsync/table_sync.rb +139 -22
- data/lib/pgsync/task.rb +30 -20
- data/lib/pgsync/task_resolver.rb +5 -3
- data/lib/pgsync/utils.rb +3 -2
- data/lib/pgsync/version.rb +1 -1
- metadata +5 -4
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: e3ef4779b0d0962cfb0a23bb8e1ec4e6a03923e1324a77280499d3c023aa7673
|
4
|
+
data.tar.gz: 59aa2f0868e3a1a920547326a49b86460ab2d8349eb0be68989899ca3bf7ada2
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 9c6761c05533930479d39fcc318bd1c2601afa06ba50379c94bdaa69e3751fa3229559aa31806778cf17bdf79af0510bdbcb5d0c33dd6e4fdb1e164cda9ea49f
|
7
|
+
data.tar.gz: f8509b918137d97f80044ecac181a9499c128891d33467d91f2f74e6e49248f2ebbacee8bf6233c959c923d583ef8e14cd76b25d19b6b58c0d4166a7cc81af56
|
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,26 @@
|
|
1
|
+
## 0.6.5 (2020-07-10)
|
2
|
+
|
3
|
+
- Improved help
|
4
|
+
|
5
|
+
## 0.6.4 (2020-06-10)
|
6
|
+
|
7
|
+
- Log SQL with `--debug` option
|
8
|
+
- Improved sequence queries
|
9
|
+
|
10
|
+
## 0.6.3 (2020-06-09)
|
11
|
+
|
12
|
+
- Added `--defer-constraints-v2` option
|
13
|
+
- Ensure consistent source snapshot with `--disable-integrity`
|
14
|
+
|
15
|
+
## 0.6.2 (2020-06-09)
|
16
|
+
|
17
|
+
- Added support for `--disable-integrity` on Amazon RDS
|
18
|
+
- Fixed error when excluded table not found in source
|
19
|
+
|
20
|
+
## 0.6.1 (2020-06-07)
|
21
|
+
|
22
|
+
- Added Django and Laravel integrations
|
23
|
+
|
1
24
|
## 0.6.0 (2020-06-07)
|
2
25
|
|
3
26
|
- Added messages for different column types and non-deferrable constraints
|
data/README.md
CHANGED
@@ -35,7 +35,7 @@ This creates `.pgsync.yml` for you to customize. We recommend checking this into
|
|
35
35
|
|
36
36
|
First, make sure your schema is set up in both databases. We recommend using a schema migration tool for this, but pgsync also provides a few [convenience methods](#schema). Once that’s done, you’re ready to sync data.
|
37
37
|
|
38
|
-
Sync
|
38
|
+
Sync tables
|
39
39
|
|
40
40
|
```sh
|
41
41
|
pgsync
|
@@ -198,20 +198,20 @@ Rules starting with `unique_` require the table to have a single column primary
|
|
198
198
|
|
199
199
|
Foreign keys can make it difficult to sync data. Three options are:
|
200
200
|
|
201
|
-
1.
|
202
|
-
2.
|
203
|
-
3. Disable foreign key triggers, which can silently break referential integrity
|
201
|
+
1. Defer constraints (recommended)
|
202
|
+
2. Manually specify the order of tables
|
203
|
+
3. Disable foreign key triggers, which can silently break referential integrity (not recommended)
|
204
204
|
|
205
|
-
|
205
|
+
To defer constraints, use:
|
206
206
|
|
207
207
|
```sh
|
208
|
-
pgsync
|
208
|
+
pgsync --defer-constraints-v2
|
209
209
|
```
|
210
210
|
|
211
|
-
|
211
|
+
To manually specify the order of tables, use `--jobs 1` so tables are synced one-at-a-time.
|
212
212
|
|
213
213
|
```sh
|
214
|
-
pgsync --
|
214
|
+
pgsync table1,table2,table3 --jobs 1
|
215
215
|
```
|
216
216
|
|
217
217
|
To disable foreign key triggers and potentially break referential integrity, use:
|
@@ -220,6 +220,8 @@ To disable foreign key triggers and potentially break referential integrity, use
|
|
220
220
|
pgsync --disable-integrity
|
221
221
|
```
|
222
222
|
|
223
|
+
This requires superuser privileges on the `to` database. If syncing to (not from) Amazon RDS, use the `rds_superuser` role. If syncing to (not from) Heroku, there doesn’t appear to be a way to disable integrity.
|
224
|
+
|
223
225
|
## Triggers
|
224
226
|
|
225
227
|
Disable user triggers with:
|
@@ -262,6 +264,57 @@ This creates `.pgsync-db2.yml` for you to edit. Specify a database in commands w
|
|
262
264
|
pgsync --db db2
|
263
265
|
```
|
264
266
|
|
267
|
+
## Integrations
|
268
|
+
|
269
|
+
- [Django](#django)
|
270
|
+
- [Heroku](#heroku)
|
271
|
+
- [Laravel](#laravel)
|
272
|
+
- [Rails](#rails)
|
273
|
+
|
274
|
+
### Django
|
275
|
+
|
276
|
+
If you run `pgsync --init` in a Django project, migrations will be excluded in `.pgsync.yml`.
|
277
|
+
|
278
|
+
```yml
|
279
|
+
exclude:
|
280
|
+
- django_migrations
|
281
|
+
```
|
282
|
+
|
283
|
+
### Heroku
|
284
|
+
|
285
|
+
If you run `pgsync --init` in a Heroku project, the `from` database will be set in `.pgsync.yml`.
|
286
|
+
|
287
|
+
```yml
|
288
|
+
from: $(heroku config:get DATABASE_URL)?sslmode=require
|
289
|
+
```
|
290
|
+
|
291
|
+
### Laravel
|
292
|
+
|
293
|
+
If you run `pgsync --init` in a Laravel project, migrations will be excluded in `.pgsync.yml`.
|
294
|
+
|
295
|
+
```yml
|
296
|
+
exclude:
|
297
|
+
- migrations
|
298
|
+
```
|
299
|
+
|
300
|
+
### Rails
|
301
|
+
|
302
|
+
If you run `pgsync --init` in a Rails project, Active Record metadata and schema migrations will be excluded in `.pgsync.yml`.
|
303
|
+
|
304
|
+
```yml
|
305
|
+
exclude:
|
306
|
+
- ar_internal_metadata
|
307
|
+
- schema_migrations
|
308
|
+
```
|
309
|
+
|
310
|
+
## Debugging
|
311
|
+
|
312
|
+
To view the SQL that’s run, use:
|
313
|
+
|
314
|
+
```sh
|
315
|
+
pgsync --debug
|
316
|
+
```
|
317
|
+
|
265
318
|
## Other Commands
|
266
319
|
|
267
320
|
Help
|
@@ -337,6 +390,10 @@ Also check out:
|
|
337
390
|
|
338
391
|
Inspired by [heroku-pg-transfer](https://github.com/ddollar/heroku-pg-transfer).
|
339
392
|
|
393
|
+
## History
|
394
|
+
|
395
|
+
View the [changelog](https://github.com/ankane/pgsync/blob/master/CHANGELOG.md)
|
396
|
+
|
340
397
|
## Contributing
|
341
398
|
|
342
399
|
Everyone is encouraged to help improve this project. Here are a few ways you can help:
|
data/lib/pgsync.rb
CHANGED
data/lib/pgsync/client.rb
CHANGED
@@ -39,39 +39,74 @@ module PgSync
|
|
39
39
|
def slop_options
|
40
40
|
o = Slop::Options.new
|
41
41
|
o.banner = %{Usage:
|
42
|
-
|
42
|
+
pgsync [tables,groups] [sql] [options]}
|
43
43
|
|
44
|
-
|
45
|
-
o.string "-
|
46
|
-
o.string "-
|
47
|
-
|
48
|
-
o.
|
44
|
+
# not shown
|
45
|
+
o.string "-t", "--tables", "tables to sync", help: false
|
46
|
+
o.string "-g", "--groups", "groups to sync", help: false
|
47
|
+
|
48
|
+
o.separator ""
|
49
|
+
o.separator "Table options:"
|
50
|
+
o.string "--exclude", "tables to exclude"
|
49
51
|
o.string "--schemas", "schemas to sync"
|
50
|
-
o.
|
51
|
-
|
52
|
-
o.
|
53
|
-
o.
|
54
|
-
o.boolean "--
|
55
|
-
o.boolean "--debug", "debug", default: false
|
56
|
-
o.boolean "--list", "list", default: false
|
57
|
-
o.boolean "--overwrite", "overwrite existing rows", default: false, help: false
|
52
|
+
o.boolean "--all-schemas", "sync all schemas", default: false
|
53
|
+
|
54
|
+
o.separator ""
|
55
|
+
o.separator "Row options:"
|
56
|
+
o.boolean "--overwrite", "overwrite existing rows", default: false
|
58
57
|
o.boolean "--preserve", "preserve existing rows", default: false
|
59
58
|
o.boolean "--truncate", "truncate existing rows", default: false
|
60
|
-
|
61
|
-
o.
|
62
|
-
o.
|
63
|
-
o.boolean "--
|
64
|
-
o.boolean "--no-sequences", "do not sync sequences", default: false
|
65
|
-
o.boolean "--init", "init", default: false
|
66
|
-
o.boolean "--in-batches", "in batches", default: false, help: false
|
67
|
-
o.integer "--batch-size", "batch size", default: 10000, help: false
|
68
|
-
o.float "--sleep", "sleep", default: 0, help: false
|
69
|
-
o.boolean "--fail-fast", "stop on the first failed table", default: false
|
70
|
-
o.boolean "--defer-constraints", "defer constraints", default: false
|
71
|
-
o.boolean "--disable-user-triggers", "disable non-system triggers", default: false
|
59
|
+
|
60
|
+
o.separator ""
|
61
|
+
o.separator "Foreign key options:"
|
62
|
+
o.boolean "--defer-constraints-v2", "defer constraints", default: false
|
72
63
|
o.boolean "--disable-integrity", "disable foreign key triggers", default: false
|
73
|
-
o.
|
74
|
-
|
64
|
+
o.integer "-j", "--jobs", "number of tables to sync at a time"
|
65
|
+
|
66
|
+
# replaced by v2
|
67
|
+
o.boolean "--defer-constraints", "defer constraints", default: false, help: false
|
68
|
+
# private, for testing
|
69
|
+
o.boolean "--disable-integrity-v2", "disable foreign key triggers", default: false, help: false
|
70
|
+
|
71
|
+
o.separator ""
|
72
|
+
o.separator "Schema options:"
|
73
|
+
o.boolean "--schema-first", "sync schema first", default: false
|
74
|
+
o.boolean "--schema-only", "sync schema only", default: false
|
75
|
+
|
76
|
+
o.separator ""
|
77
|
+
o.separator "Config options:"
|
78
|
+
# technically, defaults to searching path for .pgsync.yml, but this is simpler
|
79
|
+
o.string "--config", "config file (defaults to .pgsync.yml)"
|
80
|
+
o.string "-d", "--db", "database-specific config file"
|
81
|
+
|
82
|
+
o.separator ""
|
83
|
+
o.separator "Connection options:"
|
84
|
+
o.string "--from", "source database URL"
|
85
|
+
o.string "--to", "destination database URL"
|
86
|
+
o.boolean "--to-safe", "confirms destination is safe (when not localhost)", default: false
|
87
|
+
|
88
|
+
o.separator ""
|
89
|
+
o.separator "Other options:"
|
90
|
+
o.boolean "--debug", "show SQL statements", default: false
|
91
|
+
o.boolean "--disable-user-triggers", "disable non-system triggers", default: false
|
92
|
+
o.boolean "--fail-fast", "stop on the first failed table", default: false
|
93
|
+
o.boolean "--no-rules", "don't apply data rules", default: false
|
94
|
+
o.boolean "--no-sequences", "don't sync sequences", default: false
|
95
|
+
|
96
|
+
# not shown in help
|
97
|
+
# o.separator ""
|
98
|
+
# o.separator "Append-only table options:"
|
99
|
+
o.boolean "--in-batches", "sync in batches", default: false, help: false
|
100
|
+
o.integer "--batch-size", "batch size", default: 10000, help: false
|
101
|
+
o.float "--sleep", "time to sleep between batches", default: 0, help: false
|
102
|
+
|
103
|
+
o.separator ""
|
104
|
+
o.separator "Other commands:"
|
105
|
+
o.boolean "--init", "create config file", default: false
|
106
|
+
o.boolean "--list", "list tables", default: false
|
107
|
+
o.boolean "-h", "--help", "print help"
|
108
|
+
o.boolean "-v", "--version", "print version"
|
109
|
+
|
75
110
|
o
|
76
111
|
end
|
77
112
|
end
|
data/lib/pgsync/data_source.rb
CHANGED
@@ -4,8 +4,10 @@ module PgSync
|
|
4
4
|
|
5
5
|
attr_reader :url
|
6
6
|
|
7
|
-
def initialize(url)
|
7
|
+
def initialize(url, name:, debug:)
|
8
8
|
@url = url
|
9
|
+
@name = name
|
10
|
+
@debug = debug
|
9
11
|
end
|
10
12
|
|
11
13
|
def exists?
|
@@ -50,10 +52,6 @@ module PgSync
|
|
50
52
|
table_set.include?(table)
|
51
53
|
end
|
52
54
|
|
53
|
-
def sequences(table, columns)
|
54
|
-
execute("SELECT #{columns.map { |f| "pg_get_serial_sequence(#{escape("#{quote_ident_full(table)}")}, #{escape(f)}) AS #{quote_ident(f)}" }.join(", ")}").first.values.compact
|
55
|
-
end
|
56
|
-
|
57
55
|
def max_id(table, primary_key, sql_clause = nil)
|
58
56
|
execute("SELECT MAX(#{quote_ident(primary_key)}) FROM #{quote_ident_full(table)}#{sql_clause}").first["max"].to_i
|
59
57
|
end
|
@@ -62,39 +60,14 @@ module PgSync
|
|
62
60
|
execute("SELECT MIN(#{quote_ident(primary_key)}) FROM #{quote_ident_full(table)}#{sql_clause}").first["min"].to_i
|
63
61
|
end
|
64
62
|
|
65
|
-
# this value comes from pg_get_serial_sequence which is already quoted
|
66
63
|
def last_value(seq)
|
67
|
-
execute("SELECT last_value FROM #{seq}").first["last_value"]
|
64
|
+
execute("SELECT last_value FROM #{quote_ident_full(seq)}").first["last_value"]
|
68
65
|
end
|
69
66
|
|
70
67
|
def truncate(table)
|
71
68
|
execute("TRUNCATE #{quote_ident_full(table)} CASCADE")
|
72
69
|
end
|
73
70
|
|
74
|
-
# https://stackoverflow.com/a/20537829
|
75
|
-
# TODO can simplify with array_position in Postgres 9.5+
|
76
|
-
def primary_key(table)
|
77
|
-
query = <<~SQL
|
78
|
-
SELECT
|
79
|
-
pg_attribute.attname,
|
80
|
-
format_type(pg_attribute.atttypid, pg_attribute.atttypmod),
|
81
|
-
pg_attribute.attnum,
|
82
|
-
pg_index.indkey
|
83
|
-
FROM
|
84
|
-
pg_index, pg_class, pg_attribute, pg_namespace
|
85
|
-
WHERE
|
86
|
-
nspname = $1 AND
|
87
|
-
relname = $2 AND
|
88
|
-
indrelid = pg_class.oid AND
|
89
|
-
pg_class.relnamespace = pg_namespace.oid AND
|
90
|
-
pg_attribute.attrelid = pg_class.oid AND
|
91
|
-
pg_attribute.attnum = any(pg_index.indkey) AND
|
92
|
-
indisprimary
|
93
|
-
SQL
|
94
|
-
rows = execute(query, [table.schema, table.name])
|
95
|
-
rows.sort_by { |r| r["indkey"].split(" ").index(r["attnum"]) }.map { |r| r["attname"] }
|
96
|
-
end
|
97
|
-
|
98
71
|
def triggers(table)
|
99
72
|
query = <<~SQL
|
100
73
|
SELECT
|
@@ -148,20 +121,34 @@ module PgSync
|
|
148
121
|
end
|
149
122
|
|
150
123
|
def execute(query, params = [])
|
124
|
+
log_sql query, params
|
151
125
|
conn.exec_params(query, params).to_a
|
152
126
|
end
|
153
127
|
|
154
128
|
def transaction
|
155
129
|
if conn.transaction_status == 0
|
156
130
|
# not currently in transaction
|
157
|
-
|
158
|
-
|
159
|
-
|
131
|
+
log_sql "BEGIN"
|
132
|
+
result =
|
133
|
+
conn.transaction do
|
134
|
+
yield
|
135
|
+
end
|
136
|
+
log_sql "COMMIT"
|
137
|
+
result
|
160
138
|
else
|
161
139
|
yield
|
162
140
|
end
|
163
141
|
end
|
164
142
|
|
143
|
+
# TODO log time for each statement
|
144
|
+
def log_sql(query, params = {})
|
145
|
+
if @debug
|
146
|
+
message = "#{colorize("[#{@name}]", :cyan)} #{query.gsub(/\s+/, " ").strip}"
|
147
|
+
message = "#{message} #{params.inspect}" if params.any?
|
148
|
+
log message
|
149
|
+
end
|
150
|
+
end
|
151
|
+
|
165
152
|
private
|
166
153
|
|
167
154
|
def concurrent_id
|
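The `log_sql` method added in this file collapses whitespace in the query before printing it and appends the parameters only when some are bound. The following is a minimal standalone sketch of that formatting, not the gem's code: `format_sql_log` is a hypothetical name, and the colorization and `@debug` check from the diff are left out.

```ruby
# Hypothetical standalone version of the message formatting in log_sql.
# Colorization and the @debug check are omitted; the name is illustrative.
def format_sql_log(name, query, params = [])
  # collapse newlines and runs of spaces (e.g. from heredocs) into single spaces
  message = "[#{name}] #{query.gsub(/\s+/, " ").strip}"
  # append bound parameters only when there are any
  message = "#{message} #{params.inspect}" if params.any?
  message
end

query = <<~SQL
  SELECT
    last_value
  FROM
    "public"."users_id_seq"
SQL

puts format_sql_log("from", query)
puts format_sql_log("to", "SELECT $1::int", [1])
```

Prefixing each line with the data source name (`from` or `to`) appears to be the reason `DataSource.new` now takes `name:` and `debug:` arguments.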
data/lib/pgsync/init.rb
CHANGED
@@ -30,8 +30,19 @@ module PgSync
|
|
30
30
|
if rails?
|
31
31
|
<<~EOS
|
32
32
|
exclude:
|
33
|
-
- schema_migrations
|
34
33
|
- ar_internal_metadata
|
34
|
+
- schema_migrations
|
35
|
+
EOS
|
36
|
+
elsif django?
|
37
|
+
# TODO exclude other tables?
|
38
|
+
<<~EOS
|
39
|
+
exclude:
|
40
|
+
- django_migrations
|
41
|
+
EOS
|
42
|
+
elsif laravel?
|
43
|
+
<<~EOS
|
44
|
+
exclude:
|
45
|
+
- migrations
|
35
46
|
EOS
|
36
47
|
else
|
37
48
|
<<~EOS
|
@@ -50,12 +61,30 @@ module PgSync
|
|
50
61
|
end
|
51
62
|
end
|
52
63
|
|
64
|
+
def django?
|
65
|
+
file_exists?("manage.py", /django/i)
|
66
|
+
end
|
67
|
+
|
53
68
|
def heroku?
|
54
69
|
`git remote -v 2>&1`.include?("git.heroku.com") rescue false
|
55
70
|
end
|
56
71
|
|
72
|
+
def laravel?
|
73
|
+
file_exists?("artisan")
|
74
|
+
end
|
75
|
+
|
57
76
|
def rails?
|
58
|
-
|
77
|
+
file_exists?("bin/rails")
|
78
|
+
end
|
79
|
+
|
80
|
+
def file_exists?(path, contents = nil)
|
81
|
+
if contents
|
82
|
+
File.read(path).match(contents)
|
83
|
+
else
|
84
|
+
File.exist?(path)
|
85
|
+
end
|
86
|
+
rescue
|
87
|
+
false
|
59
88
|
end
|
60
89
|
end
|
61
90
|
end
|
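The detection helpers in this file check for marker files in the current directory, and `file_exists?` rescues any error so that an unreadable or missing file counts as "not detected". The sketch below is a rough standalone illustration of that logic. `detect_framework` is a hypothetical wrapper that mirrors the `rails?` / `django?` / `laravel?` order of the `if`/`elsif` chain in the diff, and `!!` is added only so the helper returns a plain boolean.

```ruby
require "tmpdir"

# Hypothetical standalone copy of the detection helpers, for illustration only.
def file_exists?(path, contents = nil)
  if contents
    # truthy only when the file is readable and matches the pattern
    !!File.read(path).match(contents)
  else
    File.exist?(path)
  end
rescue
  # missing or unreadable files count as "not present"
  false
end

# Hypothetical wrapper: checks in the same order as the diff's if/elsif chain.
def detect_framework
  if file_exists?("bin/rails")
    :rails
  elsif file_exists?("manage.py", /django/i)
    :django
  elsif file_exists?("artisan")
    :laravel
  end
end

Dir.mktmpdir do |dir|
  Dir.chdir(dir) do
    File.write("manage.py", "import django\n")
    puts detect_framework.inspect
  end
end
```

Because Django detection also matches on file contents, a `manage.py` that never mentions Django would not trigger the `django_migrations` exclusion.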
data/lib/pgsync/sequence.rb
ADDED
@@ -0,0 +1,29 @@
|
|
1
|
+
# minimal class to keep schema and sequence name separate
|
2
|
+
module PgSync
|
3
|
+
class Sequence
|
4
|
+
attr_reader :schema, :name, :column
|
5
|
+
|
6
|
+
def initialize(schema, name, column:)
|
7
|
+
@schema = schema
|
8
|
+
@name = name
|
9
|
+
@column = column
|
10
|
+
end
|
11
|
+
|
12
|
+
def full_name
|
13
|
+
"#{schema}.#{name}"
|
14
|
+
end
|
15
|
+
|
16
|
+
def eql?(other)
|
17
|
+
other.schema == schema && other.name == name
|
18
|
+
end
|
19
|
+
|
20
|
+
# override hash when overriding eql?
|
21
|
+
def hash
|
22
|
+
[schema, name].hash
|
23
|
+
end
|
24
|
+
|
25
|
+
def to_s
|
26
|
+
full_name
|
27
|
+
end
|
28
|
+
end
|
29
|
+
end
|
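The reason for overriding both `eql?` and `hash` shows up in `task.rb`, where `shared_sequences` is computed as `to_sequences & from_sequences`. Ruby's `Array#&` compares elements with `hash` and `eql?`, so two `Sequence` objects built from different databases match when their schema and name agree. A small sketch using a trimmed copy of the class (the sequence names are made up for illustration):

```ruby
# Trimmed copy of the Sequence class from the diff, enough to show equality.
class Sequence
  attr_reader :schema, :name, :column

  def initialize(schema, name, column:)
    @schema = schema
    @name = name
    @column = column
  end

  def eql?(other)
    other.schema == schema && other.name == name
  end

  # override hash when overriding eql?
  def hash
    [schema, name].hash
  end

  def to_s
    "#{schema}.#{name}"
  end
end

# Illustrative data: the source has two sequences, the destination has one.
from_sequences = [
  Sequence.new("public", "users_id_seq", column: "id"),
  Sequence.new("public", "posts_id_seq", column: "id")
]
to_sequences = [Sequence.new("public", "users_id_seq", column: "id")]

# Array#& relies on hash and eql?, so distinct objects can still match
shared = to_sequences & from_sequences
puts shared.map(&:to_s).inspect
```

Note that `column` is deliberately not part of the comparison in the diff, so equality depends only on schema and sequence name.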
data/lib/pgsync/sync.rb
CHANGED
@@ -34,13 +34,13 @@ module PgSync
|
|
34
34
|
raise Error, "Danger! Add `to_safe: true` to `.pgsync.yml` if the destination is not localhost or 127.0.0.1"
|
35
35
|
end
|
36
36
|
|
37
|
+
print_description("From", source)
|
38
|
+
print_description("To", destination)
|
39
|
+
|
37
40
|
if (opts[:preserve] || opts[:overwrite]) && destination.server_version_num < 90500
|
38
41
|
raise Error, "Postgres 9.5+ is required for --preserve and --overwrite"
|
39
42
|
end
|
40
43
|
|
41
|
-
print_description("From", source)
|
42
|
-
print_description("To", destination)
|
43
|
-
|
44
44
|
resolver = TaskResolver.new(args: args, opts: opts, source: source, destination: destination, config: config, first_schema: first_schema)
|
45
45
|
tasks =
|
46
46
|
resolver.tasks.map do |task|
|
@@ -126,19 +126,20 @@ module PgSync
|
|
126
126
|
end
|
127
127
|
|
128
128
|
def source
|
129
|
-
@source ||= data_source(@options[:from])
|
129
|
+
@source ||= data_source(@options[:from], "from")
|
130
130
|
end
|
131
131
|
|
132
132
|
def destination
|
133
|
-
@destination ||= data_source(@options[:to])
|
133
|
+
@destination ||= data_source(@options[:to], "to")
|
134
134
|
end
|
135
135
|
|
136
|
-
def data_source(url)
|
137
|
-
ds = DataSource.new(url)
|
136
|
+
def data_source(url, name)
|
137
|
+
ds = DataSource.new(url, name: name, debug: @options[:debug])
|
138
138
|
ObjectSpace.define_finalizer(self, self.class.finalize(ds))
|
139
139
|
ds
|
140
140
|
end
|
141
141
|
|
142
|
+
# ideally aliases would work, but haven't found a nice way to do this
|
142
143
|
def resolve_source(source)
|
143
144
|
if source
|
144
145
|
source = source.dup
|
data/lib/pgsync/table_sync.rb
CHANGED
@@ -17,6 +17,10 @@ module PgSync
|
|
17
17
|
|
18
18
|
add_columns
|
19
19
|
|
20
|
+
add_primary_keys
|
21
|
+
|
22
|
+
add_sequences unless opts[:no_sequences]
|
23
|
+
|
20
24
|
show_notes
|
21
25
|
|
22
26
|
# don't sync tables with no shared fields
|
@@ -24,8 +28,6 @@ module PgSync
|
|
24
28
|
run_tasks(tasks.reject { |task| task.shared_fields.empty? })
|
25
29
|
end
|
26
30
|
|
27
|
-
# TODO only query specific tables
|
28
|
-
# TODO add sequences, primary keys, etc
|
29
31
|
def add_columns
|
30
32
|
source_columns = columns(source)
|
31
33
|
destination_columns = columns(destination)
|
@@ -36,6 +38,79 @@ module PgSync
|
|
36
38
|
end
|
37
39
|
end
|
38
40
|
|
41
|
+
def add_primary_keys
|
42
|
+
destination_primary_keys = primary_keys(destination)
|
43
|
+
|
44
|
+
tasks.each do |task|
|
45
|
+
task.to_primary_key = destination_primary_keys[task.table] || []
|
46
|
+
end
|
47
|
+
end
|
48
|
+
|
49
|
+
def add_sequences
|
50
|
+
source_sequences = sequences(source)
|
51
|
+
destination_sequences = sequences(destination)
|
52
|
+
|
53
|
+
tasks.each do |task|
|
54
|
+
shared_columns = Set.new(task.shared_fields)
|
55
|
+
|
56
|
+
task.from_sequences = (source_sequences[task.table] || []).select { |s| shared_columns.include?(s.column) }
|
57
|
+
task.to_sequences = (destination_sequences[task.table] || []).select { |s| shared_columns.include?(s.column) }
|
58
|
+
end
|
59
|
+
end
|
60
|
+
|
61
|
+
def sequences(data_source)
|
62
|
+
query = <<~SQL
|
63
|
+
SELECT
|
64
|
+
nt.nspname as schema,
|
65
|
+
t.relname as table,
|
66
|
+
a.attname as column,
|
67
|
+
n.nspname as sequence_schema,
|
68
|
+
s.relname as sequence
|
69
|
+
FROM
|
70
|
+
pg_class s
|
71
|
+
INNER JOIN
|
72
|
+
pg_depend d ON d.objid = s.oid
|
73
|
+
INNER JOIN
|
74
|
+
pg_class t ON d.objid = s.oid AND d.refobjid = t.oid
|
75
|
+
INNER JOIN
|
76
|
+
pg_attribute a ON (d.refobjid, d.refobjsubid) = (a.attrelid, a.attnum)
|
77
|
+
INNER JOIN
|
78
|
+
pg_namespace n ON n.oid = s.relnamespace
|
79
|
+
INNER JOIN
|
80
|
+
pg_namespace nt ON nt.oid = t.relnamespace
|
81
|
+
WHERE
|
82
|
+
s.relkind = 'S'
|
83
|
+
SQL
|
84
|
+
data_source.execute(query).group_by { |r| Table.new(r["schema"], r["table"]) }.map do |k, v|
|
85
|
+
[k, v.map { |r| Sequence.new(r["sequence_schema"], r["sequence"], column: r["column"]) }]
|
86
|
+
end.to_h
|
87
|
+
end
|
88
|
+
|
89
|
+
def primary_keys(data_source)
|
90
|
+
# https://stackoverflow.com/a/20537829
|
91
|
+
# TODO can simplify with array_position in Postgres 9.5+
|
92
|
+
query = <<~SQL
|
93
|
+
SELECT
|
94
|
+
nspname AS schema,
|
95
|
+
relname AS table,
|
96
|
+
pg_attribute.attname AS column,
|
97
|
+
format_type(pg_attribute.atttypid, pg_attribute.atttypmod),
|
98
|
+
pg_attribute.attnum,
|
99
|
+
pg_index.indkey
|
100
|
+
FROM
|
101
|
+
pg_index, pg_class, pg_attribute, pg_namespace
|
102
|
+
WHERE
|
103
|
+
indrelid = pg_class.oid AND
|
104
|
+
pg_class.relnamespace = pg_namespace.oid AND
|
105
|
+
pg_attribute.attrelid = pg_class.oid AND
|
106
|
+
pg_attribute.attnum = any(pg_index.indkey) AND
|
107
|
+
indisprimary
|
108
|
+
SQL
|
109
|
+
data_source.execute(query).group_by { |r| Table.new(r["schema"], r["table"]) }.map do |k, v|
|
110
|
+
[k, v.sort_by { |r| r["indkey"].split(" ").index(r["attnum"]) }.map { |r| r["column"] }]
|
111
|
+
end.to_h
|
112
|
+
end
|
113
|
+
|
39
114
|
def show_notes
|
40
115
|
# for tables
|
41
116
|
resolver.notes.each do |note|
|
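The new `primary_keys` method fetches keys for all tables in one query, then groups the rows per table and orders each table's columns by their position in `indkey`. The sketch below replays that grouping on hand-written rows. The row values are invented, plain arrays stand in for `Table` objects, and string values are assumed because the diff calls `split(" ")` on `indkey`.

```ruby
# Invented rows shaped like the result of the primary key query.
# A composite key on (group_id, user_id) has indkey "3 2": attnum 3 comes first.
rows = [
  {"schema" => "public", "table" => "memberships", "column" => "user_id", "attnum" => "2", "indkey" => "3 2"},
  {"schema" => "public", "table" => "memberships", "column" => "group_id", "attnum" => "3", "indkey" => "3 2"},
  {"schema" => "public", "table" => "users", "column" => "id", "attnum" => "1", "indkey" => "1"}
]

primary_keys =
  rows.group_by { |r| [r["schema"], r["table"]] }.map do |k, v|
    # order columns by where their attnum appears in the index definition
    [k, v.sort_by { |r| r["indkey"].split(" ").index(r["attnum"]) }.map { |r| r["column"] }]
  end.to_h

primary_keys.each do |(schema, table), columns|
  puts "#{schema}.#{table}: #{columns.join(", ")}"
end
```

Sorting by position in `indkey` rather than by `attnum` keeps composite keys in the order they were declared in the index.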
@@ -93,28 +168,33 @@ module PgSync
|
|
93
168
|
def run_tasks(tasks, &block)
|
94
169
|
notices = []
|
95
170
|
failed_tables = []
|
96
|
-
|
97
|
-
spinners = TTY::Spinner::Multi.new(format: :dots, output: output)
|
98
|
-
task_spinners = {}
|
99
171
|
started_at = {}
|
100
172
|
|
173
|
+
show_spinners = output.tty? && !opts[:in_batches] && !opts[:debug]
|
174
|
+
if show_spinners
|
175
|
+
spinners = TTY::Spinner::Multi.new(format: :dots, output: output)
|
176
|
+
task_spinners = {}
|
177
|
+
end
|
178
|
+
|
101
179
|
start = lambda do |task, i|
|
102
180
|
message = ":spinner #{display_item(task)}"
|
103
|
-
|
104
|
-
if
|
105
|
-
|
106
|
-
log message.sub(":spinner", "⠋")
|
107
|
-
else
|
181
|
+
|
182
|
+
if show_spinners
|
183
|
+
spinner = spinners.register(message)
|
108
184
|
spinner.auto_spin
|
185
|
+
task_spinners[task] = spinner
|
186
|
+
elsif opts[:in_batches]
|
187
|
+
log message.sub(":spinner", "⠋")
|
109
188
|
end
|
110
|
-
|
189
|
+
|
111
190
|
started_at[task] = Time.now
|
112
191
|
end
|
113
192
|
|
114
193
|
finish = lambda do |task, i, result|
|
115
|
-
spinner = task_spinners[task]
|
116
194
|
time = (Time.now - started_at[task]).round(1)
|
117
195
|
|
196
|
+
success = result[:status] == "success"
|
197
|
+
|
118
198
|
message =
|
119
199
|
if result[:message]
|
120
200
|
"(#{result[:message].lines.first.to_s.strip})"
|
@@ -124,24 +204,31 @@ module PgSync
|
|
124
204
|
|
125
205
|
notices.concat(result[:notices])
|
126
206
|
|
127
|
-
if
|
128
|
-
spinner
|
207
|
+
if show_spinners
|
208
|
+
spinner = task_spinners[task]
|
209
|
+
if success
|
210
|
+
spinner.success(message)
|
211
|
+
else
|
212
|
+
spinner.error(message)
|
213
|
+
end
|
129
214
|
else
|
130
|
-
|
131
|
-
|
132
|
-
fail_sync(failed_tables) if opts[:fail_fast]
|
215
|
+
status = success ? "✔" : "✖"
|
216
|
+
log [status, display_item(task), message].join(" ")
|
133
217
|
end
|
134
218
|
|
135
|
-
unless
|
136
|
-
|
137
|
-
|
219
|
+
unless success
|
220
|
+
failed_tables << task_name(task)
|
221
|
+
fail_sync(failed_tables) if opts[:fail_fast]
|
138
222
|
end
|
139
223
|
end
|
140
224
|
|
141
225
|
options = {start: start, finish: finish}
|
142
226
|
|
143
227
|
jobs = opts[:jobs]
|
144
|
-
|
228
|
+
|
229
|
+
# disable multiple jobs for defer constraints and disable integrity
|
230
|
+
# so we can use a transaction to ensure a consistent snapshot
|
231
|
+
if opts[:debug] || opts[:in_batches] || opts[:defer_constraints] || opts[:defer_constraints_v2] || opts[:disable_integrity] || opts[:disable_integrity_v2]
|
145
232
|
warning "--jobs ignored" if jobs
|
146
233
|
jobs = 0
|
147
234
|
end
|
@@ -171,9 +258,25 @@ module PgSync
|
|
171
258
|
fail_sync(failed_tables) if failed_tables.any?
|
172
259
|
end
|
173
260
|
|
261
|
+
# TODO add option to open transaction on source when manually specifying order of tables
|
174
262
|
def maybe_defer_constraints
|
175
|
-
if opts[:
|
263
|
+
if opts[:disable_integrity] || opts[:disable_integrity_v2]
|
264
|
+
# create a transaction on the source
|
265
|
+
# to ensure we get a consistent snapshot
|
266
|
+
source.transaction do
|
267
|
+
yield
|
268
|
+
end
|
269
|
+
elsif opts[:defer_constraints] || opts[:defer_constraints_v2]
|
176
270
|
destination.transaction do
|
271
|
+
if opts[:defer_constraints_v2]
|
272
|
+
table_constraints = non_deferrable_constraints(destination)
|
273
|
+
table_constraints.each do |table, constraints|
|
274
|
+
constraints.each do |constraint|
|
275
|
+
destination.execute("ALTER TABLE #{quote_ident_full(table)} ALTER CONSTRAINT #{quote_ident(constraint)} DEFERRABLE")
|
276
|
+
end
|
277
|
+
end
|
278
|
+
end
|
279
|
+
|
177
280
|
destination.execute("SET CONSTRAINTS ALL DEFERRED")
|
178
281
|
|
179
282
|
# create a transaction on the source
|
@@ -181,6 +284,20 @@ module PgSync
|
|
181
284
|
source.transaction do
|
182
285
|
yield
|
183
286
|
end
|
287
|
+
|
288
|
+
# set them back
|
289
|
+
# there are 3 modes: DEFERRABLE INITIALLY DEFERRED, DEFERRABLE INITIALLY IMMEDIATE, and NOT DEFERRABLE
|
290
|
+
# we only update NOT DEFERRABLE
|
291
|
+
# https://www.postgresql.org/docs/current/sql-set-constraints.html
|
292
|
+
if opts[:defer_constraints_v2]
|
293
|
+
destination.execute("SET CONSTRAINTS ALL IMMEDIATE")
|
294
|
+
|
295
|
+
table_constraints.each do |table, constraints|
|
296
|
+
constraints.each do |constraint|
|
297
|
+
destination.execute("ALTER TABLE #{quote_ident_full(table)} ALTER CONSTRAINT #{quote_ident(constraint)} NOT DEFERRABLE")
|
298
|
+
end
|
299
|
+
end
|
300
|
+
end
|
184
301
|
end
|
185
302
|
else
|
186
303
|
yield
|
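With `--defer-constraints-v2`, the destination transaction temporarily marks `NOT DEFERRABLE` constraints as `DEFERRABLE`, defers all constraints, runs the sync, then checks them and restores the original mode. The sketch below only builds the statement list, to make the ordering visible. `defer_constraints_v2_statements` and the pure-Ruby `quote_ident` are hypothetical helpers, and the input hash stands in for the result of `non_deferrable_constraints(destination)`, which is not shown in this diff.

```ruby
# Hypothetical helper: doubles embedded quotes, with no database dependency.
def quote_ident(value)
  %("#{value.to_s.gsub('"', '""')}")
end

# Returns the statements run before and after the sync, in order.
# table_constraints maps [schema, table] => [constraint names] (an assumed shape).
def defer_constraints_v2_statements(table_constraints)
  alter = lambda do |mode|
    table_constraints.flat_map do |(schema, table), constraints|
      constraints.map do |constraint|
        "ALTER TABLE #{quote_ident(schema)}.#{quote_ident(table)} " \
          "ALTER CONSTRAINT #{quote_ident(constraint)} #{mode}"
      end
    end
  end

  {
    before: alter.call("DEFERRABLE") + ["SET CONSTRAINTS ALL DEFERRED"],
    after: ["SET CONSTRAINTS ALL IMMEDIATE"] + alter.call("NOT DEFERRABLE")
  }
end

statements = defer_constraints_v2_statements(
  { ["public", "posts"] => ["posts_user_id_fkey"] }
)
puts statements[:before]
puts "-- tables are synced here --"
puts statements[:after]
```

In the diff, `SET CONSTRAINTS ALL IMMEDIATE` runs before the constraints are switched back to `NOT DEFERRABLE`, so any deferred checks run while the transaction is still open.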
data/lib/pgsync/task.rb
CHANGED
@@ -3,7 +3,7 @@ module PgSync
|
|
3
3
|
include Utils
|
4
4
|
|
5
5
|
attr_reader :source, :destination, :config, :table, :opts
|
6
|
-
attr_accessor :from_columns, :to_columns
|
6
|
+
attr_accessor :from_columns, :to_columns, :from_sequences, :to_sequences, :to_primary_key
|
7
7
|
|
8
8
|
def initialize(source:, destination:, config:, table:, opts:)
|
9
9
|
@source = source
|
@@ -11,6 +11,8 @@ module PgSync
|
|
11
11
|
@config = config
|
12
12
|
@table = table
|
13
13
|
@opts = opts
|
14
|
+
@from_sequences = []
|
15
|
+
@to_sequences = []
|
14
16
|
end
|
15
17
|
|
16
18
|
def quoted_table
|
@@ -39,14 +41,6 @@ module PgSync
|
|
39
41
|
@shared_fields ||= to_fields & from_fields
|
40
42
|
end
|
41
43
|
|
42
|
-
def from_sequences
|
43
|
-
@from_sequences ||= opts[:no_sequences] ? [] : source.sequences(table, shared_fields)
|
44
|
-
end
|
45
|
-
|
46
|
-
def to_sequences
|
47
|
-
@to_sequences ||= opts[:no_sequences] ? [] : destination.sequences(table, shared_fields)
|
48
|
-
end
|
49
|
-
|
50
44
|
def shared_sequences
|
51
45
|
@shared_sequences ||= to_sequences & from_sequences
|
52
46
|
end
|
@@ -88,15 +82,10 @@ module PgSync
|
|
88
82
|
sql_clause << " #{opts[:sql]}" if opts[:sql]
|
89
83
|
|
90
84
|
bad_fields = opts[:no_rules] ? [] : config["data_rules"]
|
91
|
-
primary_key =
|
85
|
+
primary_key = to_primary_key
|
92
86
|
copy_fields = shared_fields.map { |f| f2 = bad_fields.to_a.find { |bf, _| rule_match?(table, f, bf) }; f2 ? "#{apply_strategy(f2[1], table, f, primary_key)} AS #{quote_ident(f)}" : "#{quoted_table}.#{quote_ident(f)}" }.join(", ")
|
93
87
|
fields = shared_fields.map { |f| quote_ident(f) }.join(", ")
|
94
88
|
|
95
|
-
seq_values = {}
|
96
|
-
shared_sequences.each do |seq|
|
97
|
-
seq_values[seq] = source.last_value(seq)
|
98
|
-
end
|
99
|
-
|
100
89
|
copy_to_command = "COPY (SELECT #{copy_fields} FROM #{quoted_table}#{sql_clause}) TO STDOUT"
|
101
90
|
if opts[:in_batches]
|
102
91
|
raise Error, "No primary key" if primary_key.empty?
|
@@ -156,15 +145,18 @@ module PgSync
|
|
156
145
|
destination.execute("INSERT INTO #{quoted_table} (SELECT * FROM #{quote_ident_full(temp_table)}) ON CONFLICT (#{on_conflict}) DO #{action}")
|
157
146
|
else
|
158
147
|
# use delete instead of truncate for foreign keys
|
159
|
-
if opts[:defer_constraints]
|
148
|
+
if opts[:defer_constraints] || opts[:defer_constraints_v2]
|
160
149
|
destination.execute("DELETE FROM #{quoted_table}")
|
161
150
|
else
|
162
151
|
destination.truncate(table)
|
163
152
|
end
|
164
153
|
copy(copy_to_command, dest_table: table, dest_fields: fields)
|
165
154
|
end
|
166
|
-
|
167
|
-
|
155
|
+
|
156
|
+
# update sequences
|
157
|
+
shared_sequences.each do |seq|
|
158
|
+
value = source.last_value(seq)
|
159
|
+
destination.execute("SELECT setval(#{escape(quote_ident_full(seq))}, #{escape(value)})")
|
168
160
|
end
|
169
161
|
|
170
162
|
{status: "success"}
|
@@ -214,6 +206,10 @@ module PgSync
|
|
214
206
|
|
215
207
|
def copy(source_command, dest_table:, dest_fields:)
|
216
208
|
destination_command = "COPY #{quote_ident_full(dest_table)} (#{dest_fields}) FROM STDIN"
|
209
|
+
|
210
|
+
source.log_sql(source_command)
|
211
|
+
destination.log_sql(destination_command)
|
212
|
+
|
217
213
|
destination.conn.copy_data(destination_command) do
|
218
214
|
source.conn.copy_data(source_command) do
|
219
215
|
while (row = source.conn.get_copy_data)
|
@@ -275,7 +271,7 @@ module PgSync
|
|
275
271
|
end
|
276
272
|
|
277
273
|
def maybe_disable_triggers
|
278
|
-
if opts[:disable_integrity] || opts[:disable_user_triggers]
|
274
|
+
if opts[:disable_integrity] || opts[:disable_integrity_v2] || opts[:disable_user_triggers]
|
279
275
|
destination.transaction do
|
280
276
|
triggers = destination.triggers(table)
|
281
277
|
triggers.select! { |t| t["enabled"] == "t" }
|
@@ -283,7 +279,17 @@ module PgSync
|
|
283
279
|
integrity_triggers = internal_triggers.select { |t| t["integrity"] == "t" }
|
284
280
|
restore_triggers = []
|
285
281
|
|
286
|
-
|
282
|
+
# both --disable-integrity options require superuser privileges
|
283
|
+
# however, only v2 works on Amazon RDS, which added specific support for it
|
284
|
+
# https://aws.amazon.com/about-aws/whats-new/2014/11/10/amazon-rds-postgresql-read-replicas/
|
285
|
+
#
|
286
|
+
# session_replication_role disables more than foreign keys (like triggers and rules)
|
287
|
+
# this is probably fine, but keep the current default for now
|
288
|
+
if opts[:disable_integrity_v2] || (opts[:disable_integrity] && rds?)
|
289
|
+
# SET LOCAL lasts until the end of the transaction
|
290
|
+
# https://www.postgresql.org/docs/current/sql-set.html
|
291
|
+
destination.execute("SET LOCAL session_replication_role = replica")
|
292
|
+
elsif opts[:disable_integrity]
|
287
293
|
integrity_triggers.each do |trigger|
|
288
294
|
destination.execute("ALTER TABLE #{quoted_table} DISABLE TRIGGER #{quote_ident(trigger["name"])}")
|
289
295
|
end
|
@@ -311,5 +317,9 @@ module PgSync
|
|
311
317
|
yield
|
312
318
|
end
|
313
319
|
end
|
320
|
+
|
321
|
+
def rds?
|
322
|
+
destination.execute("SELECT name, setting FROM pg_settings WHERE name LIKE 'rds.%'").any?
|
323
|
+
end
|
314
324
|
end
|
315
325
|
end
|
data/lib/pgsync/task_resolver.rb
CHANGED
@@ -148,7 +148,7 @@ module PgSync
|
|
148
148
|
regex = Regexp.new('\A' + Regexp.escape(value).gsub('\*','[^\.]*') + '\z')
|
149
149
|
tables.reject! { |t| regex.match(t.full_name) || regex.match(t.name) }
|
150
150
|
else
|
151
|
-
tables -= [fully_resolve(to_table(value))]
|
151
|
+
tables -= [fully_resolve(to_table(value), error: false)].compact
|
152
152
|
end
|
153
153
|
end
|
154
154
|
|
@@ -181,9 +181,11 @@ module PgSync
|
|
181
181
|
end
|
182
182
|
|
183
183
|
# for tables without a schema, find the table in the search path
|
184
|
-
def fully_resolve(table)
|
184
|
+
def fully_resolve(table, error: true)
|
185
185
|
return table if table.schema
|
186
|
-
no_schema_tables[table.name]
|
186
|
+
resolved_table = no_schema_tables[table.name]
|
187
|
+
raise Error, "Table not found in source: #{table.name}" if !resolved_table && error
|
188
|
+
resolved_table
|
187
189
|
end
|
188
190
|
|
189
191
|
# parse command line arguments and YAML
|
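The `error: false` plus `.compact` change is what fixes the "excluded table not found in source" error from the changelog: an unresolved table becomes `nil`, `compact` drops it, and the subtraction is a no-op. A simplified sketch of that behavior follows. The `Table` struct, the `NO_SCHEMA_TABLES` constant and the use of `ArgumentError` are stand-ins chosen for illustration, not the gem's actual objects.

```ruby
# Hypothetical, simplified stand-ins for the resolver's state.
Table = Struct.new(:schema, :name)
NO_SCHEMA_TABLES = { "users" => Table.new("public", "users") }

def fully_resolve(table, error: true)
  return table if table.schema
  resolved_table = NO_SCHEMA_TABLES[table.name]
  # ArgumentError stands in for the gem's own Error class
  raise ArgumentError, "Table not found in source: #{table.name}" if !resolved_table && error
  resolved_table
end

tables = [Table.new("public", "users")]

# excluding a table that is missing from the source leaves the list unchanged
tables -= [fully_resolve(Table.new(nil, "missing"), error: false)].compact
puts tables.size

# excluding a known table still removes it
tables -= [fully_resolve(Table.new(nil, "users"), error: false)].compact
puts tables.size
```

Callers that keep the default `error: true` now get an explicit "Table not found in source" error instead of a `nil` flowing further into the sync.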
data/lib/pgsync/utils.rb
CHANGED
@@ -3,7 +3,8 @@ module PgSync
|
|
3
3
|
COLOR_CODES = {
|
4
4
|
red: 31,
|
5
5
|
green: 32,
|
6
|
-
yellow: 33
|
6
|
+
yellow: 33,
|
7
|
+
cyan: 36
|
7
8
|
}
|
8
9
|
|
9
10
|
def log(message = nil)
|
@@ -59,7 +60,7 @@ module PgSync
|
|
59
60
|
end
|
60
61
|
|
61
62
|
def quote_ident_full(ident)
|
62
|
-
if ident.is_a?(Table)
|
63
|
+
if ident.is_a?(Table) || ident.is_a?(Sequence)
|
63
64
|
[quote_ident(ident.schema), quote_ident(ident.name)].join(".")
|
64
65
|
else # temp table names are strings
|
65
66
|
quote_ident(ident)
|
data/lib/pgsync/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: pgsync
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.6.
|
4
|
+
version: 0.6.5
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Andrew Kane
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date: 2020-
|
11
|
+
date: 2020-07-10 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: parallel
|
@@ -44,14 +44,14 @@ dependencies:
|
|
44
44
|
requirements:
|
45
45
|
- - ">="
|
46
46
|
- !ruby/object:Gem::Version
|
47
|
-
version: 4.8.
|
47
|
+
version: 4.8.2
|
48
48
|
type: :runtime
|
49
49
|
prerelease: false
|
50
50
|
version_requirements: !ruby/object:Gem::Requirement
|
51
51
|
requirements:
|
52
52
|
- - ">="
|
53
53
|
- !ruby/object:Gem::Version
|
54
|
-
version: 4.8.
|
54
|
+
version: 4.8.2
|
55
55
|
- !ruby/object:Gem::Dependency
|
56
56
|
name: tty-spinner
|
57
57
|
requirement: !ruby/object:Gem::Requirement
|
@@ -125,6 +125,7 @@ files:
|
|
125
125
|
- lib/pgsync/data_source.rb
|
126
126
|
- lib/pgsync/init.rb
|
127
127
|
- lib/pgsync/schema_sync.rb
|
128
|
+
- lib/pgsync/sequence.rb
|
128
129
|
- lib/pgsync/sync.rb
|
129
130
|
- lib/pgsync/table.rb
|
130
131
|
- lib/pgsync/table_sync.rb
|