pgsync 0.5.4 → 0.6.3


checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: aa1085a96bdf17dcf1d5e240276faf9d26ee31a4c454e6bb042d9b6e919c6191
-   data.tar.gz: c1e3d4be165bf9365d8faedee4cf7177908c1f4565342467691842d376f96d98
+   metadata.gz: 0d3ff88829338544bfe3434a16a00640e6496b7b89b1ef5dd43e66f844ea51ce
+   data.tar.gz: d5457e8c2596fdda651c9739712fe0934af14028295ea9cd6addfc70438ced16
  SHA512:
-   metadata.gz: 68c7e7820c9e57e0f08f4f46030b9593b274bb1b731926d7c861d59424767c8dc650cf67a5849db463600392d36059fcfa451fe9c2e408213e35463c71f3cc6a
-   data.tar.gz: 66080f837d39782139fd59ca60edf7bc18cfc5f37e5f8b76ff8d5b2bfced3fbd2f542cf08775e78d06872eda0ed887cf64993a2b955260eee29b3ed8892e3686
+   metadata.gz: 6f9549d0d85cb502b5b88a3178fe617bf94f032d72787ab6dcf8f6097dc256ca2e969e2c4a434ebf33e4de17a4f11c590c57ef8cae420b20fe4f39bfed795baa
+   data.tar.gz: b10ac04af7dc26c3067c2ff73f163c69d73dfb5c1f559585adcf774312ac78ac07dcd790e1706cc571a90f0e4573060cedf01fc6ebd8b1732fbaa16b486213dd
@@ -1,3 +1,41 @@
+ ## 0.6.3 (2020-06-09)
+
+ - Added `--defer-constraints-v2` option
+ - Ensure consistent source snapshot with `--disable-integrity`
+
+ ## 0.6.2 (2020-06-09)
+
+ - Added support for `--disable-integrity` on Amazon RDS
+ - Fixed error when excluded table not found in source
+
+ ## 0.6.1 (2020-06-07)
+
+ - Added Django and Laravel integrations
+
+ ## 0.6.0 (2020-06-07)
+
+ - Added messages for different column types and non-deferrable constraints
+ - Added support for wildcards to `--exclude`
+ - Improved `--overwrite` and `--preserve` options for foreign keys
+ - Improved output for schema sync
+ - Fixed `--overwrite` and `--preserve` options for multicolumn primary keys
+ - Fixed output for notices
+
+ Breaking
+
+ - Syncs shared tables instead of raising an error when tables missing in destination
+ - Raise an error when `--config` or `--db` option provided and config not found
+ - Removed deprecated options
+ - Dropped support for Postgres < 9.5
+
+ ## 0.5.5 (2020-05-13)
+
+ - Added `--jobs` option
+ - Added `--defer-constraints` option
+ - Added `--disable-user-triggers` option
+ - Added `--disable-integrity` option
+ - Improved error message for older libpq
+
  ## 0.5.4 (2020-05-09)
 
  - Fixed output for `--in-batches`
@@ -1,6 +1,6 @@
  The MIT License (MIT)
 
- Copyright (c) 2015-2019 Andrew Kane
+ Copyright (c) 2015-2020 Andrew Kane
 
  Permission is hereby granted, free of charge, to any person obtaining a copy
  of this software and associated documentation files (the "Software"), to deal
data/README.md CHANGED
@@ -33,20 +33,26 @@ This creates `.pgsync.yml` for you to customize. We recommend checking this into
 
  ## How to Use
 
- Sync all tables
+ First, make sure your schema is set up in both databases. We recommend using a schema migration tool for this, but pgsync also provides a few [convenience methods](#schema). Once that’s done, you’re ready to sync data.
+
+ Sync tables
 
  ```sh
  pgsync
  ```
 
- **Note:** pgsync assumes your schema is setup in your `to` database. See the [schema section](#schema) if that’s not the case.
-
  Sync specific tables
 
  ```sh
  pgsync table1,table2
  ```
 
+ Works with wildcards as well
+
+ ```sh
+ pgsync "table*"
+ ```
+
  Sync specific rows (existing rows are overwritten)
 
  ```sh
@@ -65,13 +71,15 @@ Or truncate them
  pgsync products "where store_id = 1" --truncate
  ```
 
- ### Exclude Tables
+ ## Tables
+
+ Exclude specific tables
 
  ```sh
- pgsync --exclude users
+ pgsync --exclude table1,table2
  ```
 
- To always exclude, add to `.pgsync.yml`.
+ Add to `.pgsync.yml` to exclude by default
 
  ```yml
  exclude:
@@ -79,15 +87,17 @@ exclude:
    - table2
  ```
 
- For Rails, you probably want to exclude schema migrations and ActiveRecord metadata.
+ Sync tables from all schemas or specific schemas (by default, only the search path is synced)
 
- ```yml
- exclude:
-   - schema_migrations
-   - ar_internal_metadata
+ ```sh
+ pgsync --all-schemas
+ # or
+ pgsync --schemas public,other
+ # or
+ pgsync public.table1,other.table2
  ```
 
- ### Groups
+ ## Groups
 
  Define groups in `.pgsync.yml`:
 
@@ -104,6 +114,8 @@ And run:
  pgsync group1
  ```
 
+ ## Variables
+
  You can also use groups to sync a specific record and associated records in other tables.
 
  To get product `123` with its reviews, last 10 coupons, and store, use:
@@ -123,18 +135,14 @@ And run:
  pgsync product:123
  ```
 
- ### Schema
-
- **Note:** pgsync is designed to sync data. You should use a schema migration tool to manage schema changes. The methods in this section are provided for convenience but not recommended.
+ ## Schema
 
- Sync schema before the data
+ Sync schema before the data (this wipes out existing data)
 
  ```sh
  pgsync --schema-first
  ```
 
- **Note:** This wipes out existing data
-
  Specify tables
 
  ```sh
@@ -149,13 +157,9 @@ pgsync --schema-only
 
  pgsync does not try to sync Postgres extensions.
 
- ## Data Protection
-
- Always make sure your [connection is secure](https://ankane.org/postgres-sslmode-explained) when connecting to a database over a network you don’t fully trust. Your best option is to connect over SSH or a VPN. Another option is to use `sslmode=verify-full`. If you don’t do this, your database credentials can be compromised.
-
- ## Sensitive Information
+ ## Sensitive Data
 
- Prevent sensitive information like email addresses from leaving the remote server.
+ Prevent sensitive data like email addresses from leaving the remote server.
 
  Define rules in `.pgsync.yml`:
 
@@ -188,7 +192,63 @@ Options for replacement are:
  - `null`
  - `untouched`
 
- Rules starting with `unique_` require the table to have a primary key. `unique_phone` requires a numeric primary key.
+ Rules starting with `unique_` require the table to have a single column primary key. `unique_phone` requires a numeric primary key.
+
+ ## Foreign Keys
+
+ Foreign keys can make it difficult to sync data. Three options are:
+
+ 1. Defer constraints (recommended)
+ 2. Manually specify the order of tables
+ 3. Disable foreign key triggers, which can silently break referential integrity (not recommended)
+
+ To defer constraints, use:
+
+ ```sh
+ pgsync --defer-constraints-v2
+ ```
+
+ To manually specify the order of tables, use `--jobs 1` so tables are synced one-at-a-time.
+
+ ```sh
+ pgsync table1,table2,table3 --jobs 1
+ ```
+
+ To disable foreign key triggers and potentially break referential integrity, use:
+
+ ```sh
+ pgsync --disable-integrity
+ ```
+
+ This requires superuser privileges on the `to` database. If syncing to (not from) Amazon RDS, use the `rds_superuser` role. If syncing to (not from) Heroku, there doesn’t appear to be a way to disable integrity.
+
+ ## Triggers
+
+ Disable user triggers with:
+
+ ```sh
+ pgsync --disable-user-triggers
+ ```
+
+ ## Append-Only Tables
+
+ For extremely large, append-only tables, sync in batches.
+
+ ```sh
+ pgsync large_table --in-batches
+ ```
+
+ The script will resume where it left off when run again, making it great for backfills.
+
+ ## Connection Security
+
+ Always make sure your [connection is secure](https://ankane.org/postgres-sslmode-explained) when connecting to a database over a network you don’t fully trust. Your best option is to connect over SSH or a VPN. Another option is to use `sslmode=verify-full`. If you don’t do this, your database credentials can be compromised.
+
+ ## Safety
+
+ To keep you from accidentally overwriting production, the destination is limited to `localhost` or `127.0.0.1` by default.
+
+ To use another host, add `to_safe: true` to your `.pgsync.yml`.
 
  ## Multiple Databases
 
@@ -204,31 +264,50 @@ This creates `.pgsync-db2.yml` for you to edit. Specify a database in commands w
  pgsync --db db2
  ```
 
- ## Safety
+ ## Integrations
 
- To keep you from accidentally overwriting production, the destination is limited to `localhost` or `127.0.0.1` by default.
+ - [Django](#django)
+ - [Heroku](#heroku)
+ - [Laravel](#laravel)
+ - [Rails](#rails)
 
- To use another host, add `to_safe: true` to your `.pgsync.yml`.
+ ### Django
+
+ If you run `pgsync --init` in a Django project, migrations will be excluded in `.pgsync.yml`.
+
+ ```yml
+ exclude:
+   - django_migrations
+ ```
 
- ## Large Tables
+ ### Heroku
 
- For extremely large tables, sync in batches.
+ If you run `pgsync --init` in a Heroku project, the `from` database will be set in `.pgsync.yml`.
 
- ```sh
- pgsync large_table --in-batches
+ ```yml
+ from: $(heroku config:get DATABASE_URL)?sslmode=require
  ```
 
- The script will resume where it left off when run again, making it great for backfills.
+ ### Laravel
 
- ## Foreign Keys
+ If you run `pgsync --init` in a Laravel project, migrations will be excluded in `.pgsync.yml`.
 
- By default, tables are copied in parallel. If you use foreign keys, this can cause violations. You can specify tables to be copied serially with:
+ ```yml
+ exclude:
+   - migrations
+ ```
 
- ```sh
- pgsync group1 --debug
+ ### Rails
+
+ If you run `pgsync --init` in a Rails project, Active Record metadata and schema migrations will be excluded in `.pgsync.yml`.
+
+ ```yml
+ exclude:
+   - ar_internal_metadata
+   - schema_migrations
  ```
 
- ## Reference
+ ## Other Commands
 
  Help
 
@@ -242,6 +321,12 @@ Version
  pgsync --version
  ```
 
+ List tables
+
+ ```sh
+ pgsync --list
+ ```
+
  ## Scripts
 
  Use groups when possible to take advantage of parallelism.
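
Reviewer note: the Safety section moved earlier in the README by this diff describes a host check on the destination. The rule can be sketched in Ruby as follows, assuming the destination is given as a URI. `SAFE_HOSTS` and `safe_destination?` are hypothetical names used only for illustration; this is not pgsync's actual implementation.

```ruby
require "uri"

# Hosts the README lists as allowed destinations by default.
SAFE_HOSTS = ["localhost", "127.0.0.1"].freeze

# Hypothetical helper: true when syncing to `to_url` should be allowed.
# `to_safe` mirrors the `to_safe: true` setting from `.pgsync.yml`.
def safe_destination?(to_url, to_safe: false)
  return true if to_safe
  SAFE_HOSTS.include?(URI.parse(to_url).host)
end

puts safe_destination?("postgres://localhost:5432/myapp_development")
puts safe_destination?("postgres://db.example.com:5432/myapp")
puts safe_destination?("postgres://db.example.com:5432/myapp", to_safe: true)
```

The sketch compares host names only; how pgsync treats socket connections or other host spellings is not shown in this diff.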
data/config.yml CHANGED
@@ -20,6 +20,10 @@ to: postgres://localhost:5432/myapp_development
  #   - table1
  #   - table2
 
+ # sync specific schemas
+ # schemas:
+ #   - public
+
  # protect sensitive information
  data_rules:
    email: unique_email
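
Reviewer note: the new `schemas` key is commented out in the config template. Below is a minimal Ruby sketch of how a config with that key uncommented could be read using the standard `yaml` library. `schemas_from` is a hypothetical helper, and the `nil` return stands in for "use the default (the search path)" as the README describes it; pgsync's own config loading is not part of this hunk.

```ruby
require "yaml"

# Hypothetical helper: returns the list under `schemas`, or nil when the key
# is absent (for example, when it is still commented out in the template).
def schemas_from(config_text)
  config = YAML.safe_load(config_text) || {}
  config["schemas"]
end

config_text = <<~CONFIG
  to: postgres://localhost:5432/myapp_development
  schemas:
    - public
  data_rules:
    email: unique_email
CONFIG

puts schemas_from(config_text).inspect
```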
@@ -10,15 +10,19 @@ require "shellwords"
  require "tempfile"
  require "uri"
  require "yaml"
+ require "open3"
 
  # modules
  require "pgsync/utils"
  require "pgsync/client"
  require "pgsync/data_source"
  require "pgsync/init"
+ require "pgsync/schema_sync"
  require "pgsync/sync"
- require "pgsync/table_list"
+ require "pgsync/table"
  require "pgsync/table_sync"
+ require "pgsync/task"
+ require "pgsync/task_resolver"
  require "pgsync/version"
 
  module PgSync
@@ -7,73 +7,75 @@ module PgSync
        output.sync = true
      end
 
-     def perform(testing: true)
-       opts = parse_args
+     def perform
+       result = Slop::Parser.new(slop_options).parse(@args)
+       arguments = result.arguments
+       options = result.to_h
 
-       # TODO throw error in 0.6.0
-       warn "Specify either --db or --config, not both" if opts[:db] && opts[:config]
+       raise Error, "Specify either --db or --config, not both" if options[:db] && options[:config]
+       raise Error, "Cannot use --overwrite with --in-batches" if options[:overwrite] && options[:in_batches]
 
-       if opts.version?
+       if options[:version]
          log VERSION
-       elsif opts.help?
-         log opts
-       # TODO remove deprecated conditions (last two)
-       elsif opts.init? || opts.setup? || opts.arguments[0] == "setup"
-         Init.new.perform(opts)
+       elsif options[:help]
+         log slop_options
+       elsif options[:init]
+         Init.new(arguments, options).perform
        else
-         Sync.new.perform(opts)
+         Sync.new(arguments, options).perform
        end
-     rescue Error, PG::ConnectionBad => e
-       raise e if testing
-       abort colorize(e.message, :red)
+     rescue => e
+       # Error, PG::ConnectionBad, Slop::Error
+       raise e if options && options[:debug]
+       abort colorize(e.message.strip, :red)
      end
 
      def self.start
-       new(ARGV).perform(testing: false)
+       new(ARGV).perform
      end
 
      protected
 
-     def parse_args
-       Slop.parse(@args) do |o|
-         o.banner = %{Usage:
-     pgsync [options]
+     def slop_options
+       o = Slop::Options.new
+       o.banner = %{Usage:
+     pgsync [options]
 
  Options:}
-         o.string "-d", "--db", "database"
-         o.string "-t", "--tables", "tables to sync"
-         o.string "-g", "--groups", "groups to sync"
-         o.string "--schemas", "schemas to sync"
-         o.string "--from", "source"
-         o.string "--to", "destination"
-         o.string "--where", "where", help: false
-         o.integer "--limit", "limit", help: false
-         o.string "--exclude", "exclude tables"
-         o.string "--config", "config file"
-         # TODO much better name for this option
-         o.boolean "--to-safe", "accept danger", default: false
-         o.boolean "--debug", "debug", default: false
-         o.boolean "--list", "list", default: false
-         o.boolean "--overwrite", "overwrite existing rows", default: false, help: false
-         o.boolean "--preserve", "preserve existing rows", default: false
-         o.boolean "--truncate", "truncate existing rows", default: false
-         o.boolean "--schema-first", "schema first", default: false
-         o.boolean "--schema-only", "schema only", default: false
-         o.boolean "--all-schemas", "all schemas", default: false
-         o.boolean "--no-rules", "do not apply data rules", default: false
-         o.boolean "--no-sequences", "do not sync sequences", default: false
-         o.boolean "--init", "init", default: false
-         o.boolean "--setup", "setup", default: false, help: false
-         o.boolean "--in-batches", "in batches", default: false, help: false
-         o.integer "--batch-size", "batch size", default: 10000, help: false
-         o.float "--sleep", "sleep", default: 0, help: false
-         o.boolean "--fail-fast", "stop on the first failed table", default: false
-         # o.array "--var", "pass a variable"
-         o.boolean "-v", "--version", "print the version"
-         o.boolean "-h", "--help", "prints help"
-       end
-     rescue Slop::Error => e
-       raise Error, e.message
+       o.string "-d", "--db", "database"
+       o.string "-t", "--tables", "tables to sync"
+       o.string "-g", "--groups", "groups to sync"
+       o.integer "-j", "--jobs", "number of tables to sync at a time"
+       o.string "--schemas", "schemas to sync"
+       o.string "--from", "source"
+       o.string "--to", "destination"
+       o.string "--exclude", "exclude tables"
+       o.string "--config", "config file"
+       o.boolean "--to-safe", "accept danger", default: false
+       o.boolean "--debug", "debug", default: false
+       o.boolean "--list", "list", default: false
+       o.boolean "--overwrite", "overwrite existing rows", default: false, help: false
+       o.boolean "--preserve", "preserve existing rows", default: false
+       o.boolean "--truncate", "truncate existing rows", default: false
+       o.boolean "--schema-first", "schema first", default: false
+       o.boolean "--schema-only", "schema only", default: false
+       o.boolean "--all-schemas", "all schemas", default: false
+       o.boolean "--no-rules", "do not apply data rules", default: false
+       o.boolean "--no-sequences", "do not sync sequences", default: false
+       o.boolean "--init", "init", default: false
+       o.boolean "--in-batches", "in batches", default: false, help: false
+       o.integer "--batch-size", "batch size", default: 10000, help: false
+       o.float "--sleep", "sleep", default: 0, help: false
+       o.boolean "--fail-fast", "stop on the first failed table", default: false
+       o.boolean "--defer-constraints", "defer constraints", default: false, help: false
+       o.boolean "--defer-constraints-v2", "defer constraints", default: false
+       o.boolean "--disable-user-triggers", "disable non-system triggers", default: false
+       o.boolean "--disable-integrity", "disable foreign key triggers", default: false
+       # private, for testing
+       o.boolean "--disable-integrity-v2", "disable foreign key triggers", default: false, help: false
+       o.boolean "-v", "--version", "print the version"
+       o.boolean "-h", "--help", "prints help"
+       o
      end
    end
  end
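
Reviewer note: the new `perform` turns the old `--db`/`--config` warning into a hard error and adds a second guard for `--overwrite` with `--in-batches`. Below is a minimal, standalone Ruby sketch of those two guard clauses, assuming `options` is a symbol-keyed Hash like the `result.to_h` used in the diff. `SketchError` and `validate_options!` are hypothetical stand-ins so the example runs without the gem; they are not part of pgsync.

```ruby
# Stand-in for PgSync::Error so the sketch is self-contained.
class SketchError < StandardError; end

# Mirrors the two guard clauses added to `perform` in this diff.
# Returns the options unchanged when no conflicting flags are set.
def validate_options!(options)
  raise SketchError, "Specify either --db or --config, not both" if options[:db] && options[:config]
  raise SketchError, "Cannot use --overwrite with --in-batches" if options[:overwrite] && options[:in_batches]
  options
end

begin
  validate_options!(db: "db2", config: ".pgsync.yml")
rescue SketchError => e
  # In the diff, the rescue in `perform` aborts with this message unless --debug is set.
  puts e.message
end
```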