postgres_upsert 2.0.0 → 5.1.0

Files changed (58)
  1. checksums.yaml +5 -5
  2. data/.gitignore +3 -0
  3. data/.rubocop.yml +57 -0
  4. data/.ruby-gemset +1 -0
  5. data/.ruby-version +1 -0
  6. data/.travis.yml +20 -0
  7. data/Gemfile +1 -2
  8. data/Gemfile.lock +175 -85
  9. data/README.md +117 -41
  10. data/Rakefile +4 -16
  11. data/app/assets/config/manifest.js +0 -0
  12. data/bin/bundle +3 -0
  13. data/bin/rails +4 -0
  14. data/bin/rake +4 -0
  15. data/bin/setup +56 -0
  16. data/config.ru +4 -0
  17. data/config/application.rb +21 -0
  18. data/config/boot.rb +3 -0
  19. data/config/database.yml +24 -0
  20. data/config/database.yml.travis +23 -0
  21. data/config/environment.rb +5 -0
  22. data/config/environments/development.rb +41 -0
  23. data/config/environments/production.rb +79 -0
  24. data/config/environments/test.rb +42 -0
  25. data/config/locales/en.yml +23 -0
  26. data/config/routes.rb +56 -0
  27. data/config/secrets.yml +22 -0
  28. data/db/migrate/20150214192135_create_test_tables.rb +24 -0
  29. data/db/migrate/20150710162236_create_composite_models_table.rb +9 -0
  30. data/db/schema.rb +48 -0
  31. data/db/seeds.rb +7 -0
  32. data/lib/postgres_upsert.rb +38 -6
  33. data/lib/postgres_upsert/model_to_model_adapter.rb +37 -0
  34. data/lib/postgres_upsert/read_adapters/active_record_adapter.rb +37 -0
  35. data/lib/postgres_upsert/read_adapters/file_adapter.rb +42 -0
  36. data/lib/postgres_upsert/read_adapters/io_adapter.rb +42 -0
  37. data/lib/postgres_upsert/result.rb +23 -0
  38. data/lib/postgres_upsert/table_writer.rb +48 -0
  39. data/lib/postgres_upsert/write_adapters/active_record_adapter.rb +36 -0
  40. data/lib/postgres_upsert/write_adapters/table_adapter.rb +56 -0
  41. data/lib/postgres_upsert/writer.rb +130 -92
  42. data/postgres_upsert.gemspec +7 -4
  43. data/spec/composite_key_spec.rb +50 -0
  44. data/spec/fixtures/comma_with_header_duplicate.csv +3 -0
  45. data/spec/fixtures/composite_key_model.rb +4 -0
  46. data/spec/fixtures/composite_key_with_header.csv +3 -0
  47. data/spec/fixtures/composite_nonkey_with_header.csv +3 -0
  48. data/spec/fixtures/test_model_copy.rb +4 -0
  49. data/spec/from_table_spec.rb +40 -0
  50. data/spec/pg_upsert_csv_spec.rb +93 -35
  51. data/spec/rails_helper.rb +1 -0
  52. data/spec/spec_helper.rb +9 -37
  53. metadata +106 -37
  54. data/VERSION +0 -1
  55. data/lib/postgres_upsert/active_record.rb +0 -13
  56. data/spec/fixtures/2_col_binary_data.dat +0 -0
  57. data/spec/pg_upsert_binary_spec.rb +0 -35
  58. data/spec/spec.opts +0 -1
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
- SHA1:
3
- metadata.gz: 363018cec57166d2976cbeedc60fad9dc17e5ac7
4
- data.tar.gz: 280a43ba9f6dea9a72111031c5c7e82089185f72
2
+ SHA256:
3
+ metadata.gz: 89478e281fa4cd37be98385dab13c96de8f4128702e4a9cfd1c7f8049d912fb8
4
+ data.tar.gz: b20dc2e32478209af409f86299ea5c702be7c7ec8cf87d7e2dd2035afe27130a
5
5
  SHA512:
6
- metadata.gz: 195008d31407158e4cecf27fbd1aec45b6b3c17a078efe5dcd567e4ec8d023f187d597cf0b1a5a1e46e53c901e957a7c70b6ceccdd1aa27231f1cd2f727b194d
7
- data.tar.gz: fbf6e4e15fc23ab1c344a1a8bdd9ba80733cc77c776b6eed98c05936a0dd6cd4889f387e43f028c8c380a2e7d4beb9f3ecc58ef5374183ff3891821077e0fb37
6
+ metadata.gz: 4349cbfbeeeeb5bdd4b95086f693e19968319f9a3b4a2bf0dbb290789b9e9893496463a08335dd2e6f9f2d7c0717036cdabfdbca00e1d2ab394beb143bbd2d45
7
+ data.tar.gz: 8abcd0a2167f3ed42ed6fa1c70c145203a565a8a1418ea3f265a1b81a8267e22aa321b544d350d8c711d420d53c0263c9487bfc8678e5499dad1363475ee2cd0
data/.gitignore CHANGED
@@ -8,6 +8,9 @@
8
8
  /test/tmp/
9
9
  /test/version_tmp/
10
10
  /tmp/
11
+ /log
12
+ .DS_Store
13
+ */.Ds_Store
11
14
 
12
15
  ## Specific to RubyMotion:
13
16
  .dat*
data/.rubocop.yml ADDED
@@ -0,0 +1,57 @@
1
+ AllCops:
2
+ Exclude:
3
+ - '.*'
4
+ - 'app/*'
5
+ - 'bin/*'
6
+ - 'config*'
7
+ - 'db/*'
8
+ - '*.md'
9
+ - 'Gemfile'
10
+ - 'Gemfile.lock'
11
+
12
+ Layout/Tab:
13
+ Enabled: false
14
+
15
+ Layout/IndentationWidth:
16
+ Enabled: false
17
+
18
+ Metrics/LineLength:
19
+ Max: 99
20
+
21
+ Metrics/MethodLength:
22
+ CountAsOne: ['array', 'hash', 'heredoc']
23
+
24
+ Style/RescueModifier:
25
+ Enabled: false
26
+
27
+ Layout/TrailingEmptyLines:
28
+ Enabled: false
29
+
30
+ Layout/AccessModifierIndentation:
31
+ Enabled: false
32
+
33
+ Style/BlockComments:
34
+ Exclude:
35
+ - 'spec/spec_helper.rb'
36
+
37
+ Style/Documentation:
38
+ Enabled: false
39
+
40
+ Style/FrozenStringLiteralComment:
41
+ Enabled: false
42
+
43
+ Style/HashEachMethods:
44
+ Enabled: true
45
+
46
+ Style/HashTransformKeys:
47
+ Enabled: true
48
+
49
+ Style/HashTransformValues:
50
+ Enabled: true
51
+
52
+ Metrics/BlockLength:
53
+ Exclude:
54
+ - 'spec/*'
55
+
56
+ Layout/MultilineMethodCallIndentation:
57
+ Enabled: false
data/.ruby-gemset ADDED
@@ -0,0 +1 @@
1
+ postgres_upsert
data/.ruby-version ADDED
@@ -0,0 +1 @@
1
+ 2.6.5
data/.travis.yml ADDED
@@ -0,0 +1,20 @@
1
+ language: ruby
2
+ dist: trusty
3
+ before_install:
4
+ - gpg2 --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB
5
+ - gem install bundler
6
+ - cp config/database.yml.travis config/database.yml
7
+ - psql -c 'create database ar_pg_copy_test;' -U postgres
8
+
9
+ rvm:
10
+ - 2.5.0
11
+
12
+ script:
13
+ - gpg2 --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB
14
+ - bundle exec rake db:setup
15
+ - bundle exec rspec spec
16
+ services:
17
+ postgresql
18
+ addons:
19
+ postgresql: "9.4"
20
+
data/Gemfile CHANGED
@@ -1,5 +1,4 @@
1
1
  source 'https://rubygems.org'
2
2
 
3
- # specify gem dependencies in activerecord-postgres-hstore.gemspec
4
- # except the platform-specific dependencies below
3
+ # specify gem dependencies in postgres_upsert.gemspec
5
4
  gemspec
data/Gemfile.lock CHANGED
@@ -1,112 +1,202 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- postgres_upsert (1.1.0)
4
+ postgres_upsert (5.1.0)
5
5
  activerecord (>= 3.0.0)
6
- pg (~> 0.17.0)
6
+ pg (>= 0.17.0)
7
7
  rails (>= 3.0.0)
8
8
 
9
9
  GEM
10
10
  remote: https://rubygems.org/
11
11
  specs:
12
- actionmailer (4.0.3)
13
- actionpack (= 4.0.3)
14
- mail (~> 2.5.4)
15
- actionpack (4.0.3)
16
- activesupport (= 4.0.3)
17
- builder (~> 3.1.0)
18
- erubis (~> 2.7.0)
19
- rack (~> 1.5.2)
20
- rack-test (~> 0.6.2)
21
- activemodel (4.0.3)
22
- activesupport (= 4.0.3)
23
- builder (~> 3.1.0)
24
- activerecord (4.0.3)
25
- activemodel (= 4.0.3)
26
- activerecord-deprecated_finders (~> 1.0.2)
27
- activesupport (= 4.0.3)
28
- arel (~> 4.0.0)
29
- activerecord-deprecated_finders (1.0.3)
30
- activesupport (4.0.3)
31
- i18n (~> 0.6, >= 0.6.4)
32
- minitest (~> 4.2)
33
- multi_json (~> 1.3)
34
- thread_safe (~> 0.1)
35
- tzinfo (~> 0.3.37)
36
- arel (4.0.2)
37
- builder (3.1.4)
12
+ actioncable (6.1.3.2)
13
+ actionpack (= 6.1.3.2)
14
+ activesupport (= 6.1.3.2)
15
+ nio4r (~> 2.0)
16
+ websocket-driver (>= 0.6.1)
17
+ actionmailbox (6.1.3.2)
18
+ actionpack (= 6.1.3.2)
19
+ activejob (= 6.1.3.2)
20
+ activerecord (= 6.1.3.2)
21
+ activestorage (= 6.1.3.2)
22
+ activesupport (= 6.1.3.2)
23
+ mail (>= 2.7.1)
24
+ actionmailer (6.1.3.2)
25
+ actionpack (= 6.1.3.2)
26
+ actionview (= 6.1.3.2)
27
+ activejob (= 6.1.3.2)
28
+ activesupport (= 6.1.3.2)
29
+ mail (~> 2.5, >= 2.5.4)
30
+ rails-dom-testing (~> 2.0)
31
+ actionpack (6.1.3.2)
32
+ actionview (= 6.1.3.2)
33
+ activesupport (= 6.1.3.2)
34
+ rack (~> 2.0, >= 2.0.9)
35
+ rack-test (>= 0.6.3)
36
+ rails-dom-testing (~> 2.0)
37
+ rails-html-sanitizer (~> 1.0, >= 1.2.0)
38
+ actiontext (6.1.3.2)
39
+ actionpack (= 6.1.3.2)
40
+ activerecord (= 6.1.3.2)
41
+ activestorage (= 6.1.3.2)
42
+ activesupport (= 6.1.3.2)
43
+ nokogiri (>= 1.8.5)
44
+ actionview (6.1.3.2)
45
+ activesupport (= 6.1.3.2)
46
+ builder (~> 3.1)
47
+ erubi (~> 1.4)
48
+ rails-dom-testing (~> 2.0)
49
+ rails-html-sanitizer (~> 1.1, >= 1.2.0)
50
+ activejob (6.1.3.2)
51
+ activesupport (= 6.1.3.2)
52
+ globalid (>= 0.3.6)
53
+ activemodel (6.1.3.2)
54
+ activesupport (= 6.1.3.2)
55
+ activerecord (6.1.3.2)
56
+ activemodel (= 6.1.3.2)
57
+ activesupport (= 6.1.3.2)
58
+ activestorage (6.1.3.2)
59
+ actionpack (= 6.1.3.2)
60
+ activejob (= 6.1.3.2)
61
+ activerecord (= 6.1.3.2)
62
+ activesupport (= 6.1.3.2)
63
+ marcel (~> 1.0.0)
64
+ mini_mime (~> 1.0.2)
65
+ activesupport (6.1.3.2)
66
+ concurrent-ruby (~> 1.0, >= 1.0.2)
67
+ i18n (>= 1.6, < 2)
68
+ minitest (>= 5.1)
69
+ tzinfo (~> 2.0)
70
+ zeitwerk (~> 2.3)
71
+ ast (2.4.1)
72
+ builder (3.2.4)
38
73
  coderay (1.1.0)
39
- diff-lcs (1.1.3)
40
- erubis (2.7.0)
41
- hike (1.2.3)
42
- i18n (0.6.11)
43
- json (1.7.6)
44
- mail (2.5.4)
45
- mime-types (~> 1.16)
46
- treetop (~> 1.4.8)
74
+ concurrent-ruby (1.1.8)
75
+ crass (1.0.6)
76
+ database_cleaner-active_record (2.0.1)
77
+ activerecord (>= 5.a)
78
+ database_cleaner-core (~> 2.0.0)
79
+ database_cleaner-core (2.0.1)
80
+ diff-lcs (1.4.4)
81
+ erubi (1.10.0)
82
+ globalid (0.4.2)
83
+ activesupport (>= 4.2.0)
84
+ i18n (1.8.10)
85
+ concurrent-ruby (~> 1.0)
86
+ jaro_winkler (1.5.4)
87
+ loofah (2.9.1)
88
+ crass (~> 1.0.2)
89
+ nokogiri (>= 1.5.9)
90
+ mail (2.7.1)
91
+ mini_mime (>= 0.1.1)
92
+ marcel (1.0.1)
47
93
  method_source (0.8.2)
48
- mime-types (1.25.1)
49
- minitest (4.7.5)
50
- multi_json (1.10.1)
51
- pg (0.17.1)
52
- polyglot (0.3.5)
94
+ mini_mime (1.0.3)
95
+ mini_portile2 (2.5.1)
96
+ minitest (5.14.4)
97
+ nio4r (2.5.7)
98
+ nokogiri (1.11.3)
99
+ mini_portile2 (~> 2.5.0)
100
+ racc (~> 1.4)
101
+ parallel (1.19.2)
102
+ parser (2.7.1.3)
103
+ ast (~> 2.4.0)
104
+ pg (1.2.3)
53
105
  pry (0.10.1)
54
106
  coderay (~> 1.1.0)
55
107
  method_source (~> 0.8.1)
56
108
  slop (~> 3.4)
109
+ pry-nav (0.3.0)
110
+ pry (>= 0.9.10, < 0.13.0)
57
111
  pry-rails (0.3.2)
58
112
  pry (>= 0.9.10)
59
- rack (1.5.2)
60
- rack-test (0.6.2)
61
- rack (>= 1.0)
62
- rails (4.0.3)
63
- actionmailer (= 4.0.3)
64
- actionpack (= 4.0.3)
65
- activerecord (= 4.0.3)
66
- activesupport (= 4.0.3)
67
- bundler (>= 1.3.0, < 2.0)
68
- railties (= 4.0.3)
69
- sprockets-rails (~> 2.0.0)
70
- railties (4.0.3)
71
- actionpack (= 4.0.3)
72
- activesupport (= 4.0.3)
113
+ racc (1.5.2)
114
+ rack (2.2.3)
115
+ rack-test (1.1.0)
116
+ rack (>= 1.0, < 3)
117
+ rails (6.1.3.2)
118
+ actioncable (= 6.1.3.2)
119
+ actionmailbox (= 6.1.3.2)
120
+ actionmailer (= 6.1.3.2)
121
+ actionpack (= 6.1.3.2)
122
+ actiontext (= 6.1.3.2)
123
+ actionview (= 6.1.3.2)
124
+ activejob (= 6.1.3.2)
125
+ activemodel (= 6.1.3.2)
126
+ activerecord (= 6.1.3.2)
127
+ activestorage (= 6.1.3.2)
128
+ activesupport (= 6.1.3.2)
129
+ bundler (>= 1.15.0)
130
+ railties (= 6.1.3.2)
131
+ sprockets-rails (>= 2.0.0)
132
+ rails-dom-testing (2.0.3)
133
+ activesupport (>= 4.2.0)
134
+ nokogiri (>= 1.6)
135
+ rails-html-sanitizer (1.3.0)
136
+ loofah (~> 2.3)
137
+ railties (6.1.3.2)
138
+ actionpack (= 6.1.3.2)
139
+ activesupport (= 6.1.3.2)
140
+ method_source
73
141
  rake (>= 0.8.7)
74
- thor (>= 0.18.1, < 2.0)
75
- rake (10.3.2)
76
- rdoc (3.12)
77
- json (~> 1.4)
78
- rspec (2.12.0)
79
- rspec-core (~> 2.12.0)
80
- rspec-expectations (~> 2.12.0)
81
- rspec-mocks (~> 2.12.0)
82
- rspec-core (2.12.2)
83
- rspec-expectations (2.12.1)
84
- diff-lcs (~> 1.1.3)
85
- rspec-mocks (2.12.2)
142
+ thor (~> 1.0)
143
+ rainbow (3.0.0)
144
+ rake (13.0.3)
145
+ rexml (3.2.4)
146
+ rspec-core (3.10.1)
147
+ rspec-support (~> 3.10.0)
148
+ rspec-expectations (3.10.1)
149
+ diff-lcs (>= 1.2.0, < 2.0)
150
+ rspec-support (~> 3.10.0)
151
+ rspec-mocks (3.10.2)
152
+ diff-lcs (>= 1.2.0, < 2.0)
153
+ rspec-support (~> 3.10.0)
154
+ rspec-rails (5.0.1)
155
+ actionpack (>= 5.2)
156
+ activesupport (>= 5.2)
157
+ railties (>= 5.2)
158
+ rspec-core (~> 3.10)
159
+ rspec-expectations (~> 3.10)
160
+ rspec-mocks (~> 3.10)
161
+ rspec-support (~> 3.10)
162
+ rspec-support (3.10.2)
163
+ rubocop (0.80.1)
164
+ jaro_winkler (~> 1.5.1)
165
+ parallel (~> 1.10)
166
+ parser (>= 2.7.0.1)
167
+ rainbow (>= 2.2.2, < 4.0)
168
+ rexml
169
+ ruby-progressbar (~> 1.7)
170
+ unicode-display_width (>= 1.4.0, < 1.7)
171
+ ruby-progressbar (1.10.1)
86
172
  slop (3.6.0)
87
- sprockets (2.11.0)
88
- hike (~> 1.2)
89
- multi_json (~> 1.0)
90
- rack (~> 1.0)
91
- tilt (~> 1.1, != 1.3.0)
92
- sprockets-rails (2.0.1)
93
- actionpack (>= 3.0)
94
- activesupport (>= 3.0)
95
- sprockets (~> 2.8)
96
- thor (0.19.1)
97
- thread_safe (0.3.4)
98
- tilt (1.4.1)
99
- treetop (1.4.15)
100
- polyglot
101
- polyglot (>= 0.3.1)
102
- tzinfo (0.3.40)
173
+ sprockets (4.0.2)
174
+ concurrent-ruby (~> 1.0)
175
+ rack (> 1, < 3)
176
+ sprockets-rails (3.2.2)
177
+ actionpack (>= 4.0)
178
+ activesupport (>= 4.0)
179
+ sprockets (>= 3.0.0)
180
+ thor (1.1.0)
181
+ tzinfo (2.0.4)
182
+ concurrent-ruby (~> 1.0)
183
+ unicode-display_width (1.6.1)
184
+ websocket-driver (0.7.3)
185
+ websocket-extensions (>= 0.1.0)
186
+ websocket-extensions (0.1.5)
187
+ zeitwerk (2.4.2)
103
188
 
104
189
  PLATFORMS
105
190
  ruby
106
191
 
107
192
  DEPENDENCIES
108
193
  bundler
194
+ database_cleaner-active_record
109
195
  postgres_upsert!
196
+ pry-nav
110
197
  pry-rails
111
- rdoc
112
- rspec (~> 2.12)
198
+ rspec-rails (>= 3.9)
199
+ rubocop
200
+
201
+ BUNDLED WITH
202
+ 2.2.3
data/README.md CHANGED
@@ -1,9 +1,11 @@
1
- # postgres_upsert
1
+ # postgres_upsert [![Build Status](https://travis-ci.org/theSteveMitchell/postgres_upsert.svg?branch=master)](https://travis-ci.org/theSteveMitchell/postgres_upsert)
2
2
 
3
3
  Allows your rails app to load data in a very fast way, avoiding calls to ActiveRecord.
4
4
 
5
5
  Using the PG gem and postgres's powerful COPY command, you can create thousands of rails objects in your db in a single query.
6
6
 
7
+ ## Compatibility Note
8
+ The master branch requires the 'pg' gem, which only supports MRI Ruby. The jruby branch requires 'activerecord-jdbcpostgresql-adapter', which, of course, only supports JRuby. Installation is the same whatever your platform.
7
9
 
8
10
  ## Install
9
11
 
@@ -17,76 +19,150 @@ Run the bundle command
17
19
 
18
20
  ## Usage
19
21
 
20
- The gem will add the aditiontal class method to ActiveRecord::Base
21
-
22
- * pg_upsert io_object_or_file_path, [options]
23
-
24
- io_object_or_file_path => is a file path or an io object (StringIO, FileIO, etc.)
22
+ ```ruby
23
+ PostgresUpsert.write <class_or_table_name>, <io_object_or_file_path>[, options]
24
+ ```
25
+ <class_or_table_name> is either an ActiveRecord::Base subclass, or a string representing the name of a database table.
26
+ <io_object_or_file_path> can be either a string representing a file path or an IO object (StringIO, File, etc.)
25
27
 
26
28
  options:
27
- :delimiter - the string to use to delimit fields. Default is ","
28
- :format - the format of the file (valid formats are :csv or :binary). Default is :csv
29
- :header => specifies if the file/io source contains a header row. Either :header option must be true, or :columns list must be passed. Default true
30
- :key_column => the primary key or unique key column on your ActiveRecord table, used to distinguish new records from existing records. Default is the primary_key of your ActiveRecord model class.
31
- :update_only => when true, postgres_upsert will ONLY update existing records, and not insert new. Default is false.
32
-
33
- pg_upsert will allow you to copy data from an arbritary IO object or from a file in the database server (when you pass the path as string).
34
- Let's first copy from a file in the database server, assuming again that we have a users table and
35
- that we are in the Rails console:
29
+ - :delimiter - the string to use to delimit fields from the source data. Default is ","
30
+ - :header => specifies if the file/io source contains a header row. Either :header option must be true, or :columns list must be passed. Default true
31
+ - :unique_key => the primary key or unique key column (or composite key columns) on your destination table, used to distinguish new records from existing records. Default is the primary_key of your destination table/model.
32
+ - :update_only => when true, postgres_upsert will ONLY update existing records, and not insert new. Default is false.
36
33
 
34
+ ## Examples
35
+ For these examples, let's assume we have a users table and model:
37
36
  ```ruby
38
- User.pg_upsert "/tmp/users.csv"
37
+ class User < ActiveRecord::Base; end
38
+ ```
39
+ In the rails console we can run:
40
+ ```ruby
41
+ PostgresUpsert.write User, "/tmp/users.csv"
39
42
  ```
40
43
 
41
- This command will use the headers in the CSV file as fields of the target table, so beware to always have a header in the files you want to import.
42
- If the column names in the CSV header do not match the field names of the target table, you can pass a map in the options parameter.
43
-
44
+ By default, this command uses the headers in the CSV file as the field names of the target table.
45
+ If the CSV file's header does not match the field names of the User class, you can pass a map in the options parameter.
44
46
  ```ruby
45
- User.pg_upsert "/tmp/users.csv", :map => {'name' => 'first_name'}
47
+ PostgresUpsert.write "users", "/tmp/users.csv", :map => {'name' => 'first_name'}
46
48
  ```
49
+ The `name` column in the CSV file will be mapped to the `first_name` field in the users table.
50
+
51
+ postgres_upsert supports 'merge' operations, which is not yet natively supported in Postgres. The data can include both new and existing records, and postgres_upsert will handle either update or insert of records appropriately. Since the Postgres COPY command does not handle this, postgres_upsert accomplishes it using an intermediary temp table.
52
+
53
+ The merge/upsert happens in 5 steps (assume your data table is called "users")
54
+ * create a temp table named users_temp_123 where "123" is a random int. In postgres temp tables are only visible to the current database session, so naming conflicts should not be a problem. We add this random suffix just for additional safety.
55
+ * COPY the data to users_temp_123
56
+ * issue a query to insert all new records from users_temp_123 into users ("new" records are those whose primary key does not already exist in the users table)
57
+ * issue a query to update all existing records in users with the data in users_temp_123 ("existing" records are those whose primary key already exists in the users table)
58
+ * drop the temp table.
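The five steps above can be sketched as plain SQL strings. This is a minimal illustration under stated assumptions, not the gem's actual implementation: the helper name `upsert_statements`, its parameters, and the exact SQL text are all hypothetical and chosen only to make the sequence readable.

```ruby
# Hypothetical helper (not part of postgres_upsert): builds the SQL for a
# temp-table upsert so the five steps can be read in order.
def upsert_statements(table:, columns:, key_columns:, suffix: rand(1000))
  temp     = "#{table}_temp_#{suffix}"
  col_list = columns.join(", ")
  non_keys = columns - key_columns

  # Join condition between the destination (d) and temp (t) tables.
  match   = key_columns.map { |c| "d.#{c} = t.#{c}" }.join(" AND ")
  assigns = non_keys.map { |c| "#{c} = t.#{c}" }.join(", ")
  select  = columns.map { |c| "t.#{c}" }.join(", ")

  [
    # 1. temp table with the same column definitions as the destination
    "CREATE TEMP TABLE #{temp} (LIKE #{table})",
    # 2. bulk-load the source rows (the data itself is streamed separately)
    "COPY #{temp} (#{col_list}) FROM STDIN WITH (FORMAT csv, HEADER true)",
    # 3. insert rows whose key is not yet in the destination
    "INSERT INTO #{table} (#{col_list}) SELECT #{select} FROM #{temp} AS t " \
      "WHERE NOT EXISTS (SELECT 1 FROM #{table} AS d WHERE #{match})",
    # 4. update rows whose key already exists in the destination
    "UPDATE #{table} AS d SET #{assigns} FROM #{temp} AS t WHERE #{match}",
    # 5. clean up
    "DROP TABLE #{temp}"
  ]
end

statements = upsert_statements(table: "users", columns: %w[id name email],
                               key_columns: %w[id], suffix: 123)
puts statements
```

With this ordering the UPDATE in step 4 also matches the rows that step 3 just inserted; in this sketch that is harmless because the values are identical.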
59
+
60
+ ## timestamp columns
61
+
62
+ Currently, postgres_upsert detects and manages the default Rails timestamp columns `created_at` and `updated_at`. If these fields exist in your destination table, postgres_upsert will keep them current as expected. I recommend you do NOT include these fields in your source CSV/IO, as postgres_upsert will not honor them.
63
+
64
+ * newly inserted records get a current timestamp for created_at
65
+ * records existing in the source file/IO will get an update to their updated_at timestamp (even if all fields maintain the same value)
66
+ * records that are in the destination table but not the source will not have their timestamps changed.
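The timestamp rules above can be modelled in memory without a database. The sketch below is only an illustration: `apply_timestamps` is a hypothetical function, rows are plain hashes keyed by `:id`, and setting `updated_at` on newly inserted rows is an assumption borrowed from the usual Rails convention rather than something stated above.

```ruby
# Hypothetical in-memory model of the timestamp rules; `now` stands in for
# the database clock.
def apply_timestamps(destination, source, now:)
  source.each do |row|
    # Any created_at/updated_at supplied by the source is ignored.
    data = row.reject { |key, _| %i[created_at updated_at].include?(key) }
    existing = destination[row[:id]]
    if existing
      existing.merge!(data)
      existing[:updated_at] = now # bumped even if no value changed
    else
      # Assumption: updated_at is set together with created_at on insert.
      destination[row[:id]] = data.merge(created_at: now, updated_at: now)
    end
  end
  destination
end

old = Time.utc(2020, 1, 1)
now = Time.utc(2021, 6, 1)
users = {
  1 => { id: 1, name: "Ann", created_at: old, updated_at: old },
  2 => { id: 2, name: "Bob", created_at: old, updated_at: old }
}
apply_timestamps(users, [{ id: 1, name: "Ann" }, { id: 3, name: "Cy" }], now: now)
```

Here row 1 keeps its original `created_at` but gets a new `updated_at`, row 2 is untouched because it is absent from the source, and row 3 is created with the current time.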
67
+
47
68
 
48
- In the above example the header name in the CSV file will be mapped to the field called first_name in the users table.
69
+ ### Overriding the unique_key
49
70
 
50
- To copy a binary formatted data file or IO object you can specify the format as binary
71
+ By default postgres_upsert uses the primary key on your ActiveRecord table to determine if each record should be inserted or updated. You can override the column using the :unique_key option:
51
72
 
52
73
  ```ruby
53
- User.pg_upsert "/tmp/users.dat", :format => :binary, :columns => ["id, "name"]
74
+ PostgresUpsert.write User, "/tmp/users.csv", :unique_key => ["external_twitter_id"]
54
75
  ```
55
76
 
56
- Which will generate the following SQL command:
77
+ Obviously, the field you pass must be a unique key in your database (this is not enforced at the moment, but will be).
78
+
79
+ If your source data does not contain the primary key, or an individual unique key, you can pass multiple columns in the unique_key option:
57
80
 
58
- ```sql
59
- COPY users (id, name) FROM '/tmp/users.dat' WITH BINARY
81
+ ```ruby
82
+ PostgresUpsert.write User, "/tmp/users.csv", :unique_key => ["state_id_number", "state_name"]
60
83
  ```
84
+ As long as the combined columns uniquely identify a row, we can successfully upsert.
61
85
 
62
- NOTE: binary files do not include header columns, so passing a :columns array is required for binary files.
86
+ Passing :update_only => true will ensure that no new records are created; existing records will still be updated.
63
87
 
88
+ ### Insert/Update Counts
89
+ PostgresUpsert.write will also return a PostgresUpsert::Result object that tells you how many records were inserted or updated:
64
90
 
65
- pg_upsert supports 'upsert' or 'merge' operations. In other words, the data source can contain both new and existing objects, and pg_upsert will handle either case. Since the Postgres native COPY command does not handle updating existing records, pg_upsert accomplishes update and insert using an intermediary temp table:
91
+ ```ruby
92
+ User.delete_all
93
+ result = PostgresUpsert.write User, "/tmp/users.csv"
94
+ result.inserted
95
+ # => 10000
96
+ result.updated
97
+ # => 0
98
+ ```
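To make the relationship between `:unique_key`, `:update_only`, and the two counters concrete, here is a small in-memory sketch. It does not touch a database and is not how the gem computes its result: `UpsertCounts` and `count_upsert` are hypothetical names standing in for `PostgresUpsert::Result` and the real write path.

```ruby
require "set"

# Hypothetical stand-in for PostgresUpsert::Result; only the two counters
# described above are modelled.
UpsertCounts = Struct.new(:inserted, :updated)

# Classifies source rows against existing rows by a single or composite key.
def count_upsert(existing_rows, source_rows, unique_key:, update_only: false)
  keys = Array(unique_key)
  existing_keys = existing_rows.map { |row| row.values_at(*keys) }.to_set
  matched, fresh = source_rows.partition do |row|
    existing_keys.include?(row.values_at(*keys))
  end
  # With update_only, rows with unseen keys are skipped rather than inserted.
  UpsertCounts.new(update_only ? 0 : fresh.size, matched.size)
end

existing = [{ state_id_number: "A1", state_name: "Ohio", name: "Ann" }]
source = [
  { state_id_number: "A1", state_name: "Ohio", name: "Anne" }, # same key: update
  { state_id_number: "A1", state_name: "Utah", name: "Bob" }   # new key: insert
]
result = count_upsert(existing, source, unique_key: %i[state_id_number state_name])
puts "inserted=#{result.inserted} updated=#{result.updated}"
```

The second source row shares `state_id_number` with an existing row but differs in `state_name`, so under the composite key it counts as new.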
66
99
 
67
- This merge/upsert happend in 5 steps (assume your data table is called "users")
68
- * create a temp table named users_temp_### where "###" is a random number. In postgres temp tables are only visible to the current database session, so naming conflicts should not be a problem.
69
- * COPY the data to user_temp
70
- * issue a query to insert all new records from users_temp_### into users (newness is determined by the presence of the primary key in the users table)
71
- * issue a query to update all records in users with the data in users_temp_### (matching on primary key)
72
- * drop the temp table.
100
+ ### Huge Caveat!
101
+ Since postgres_upsert does not use validations or even instantiate Rails objects, you can get invalid data if you're not careful. postgres_upsert assumes that your source data is minimally cleaned up, and will not tell you if any data is invalid based on Rails model rules. It will, of course, raise an error if data does not conform to your database constraints.
73
102
 
74
- ### overriding the key_column
103
+ ### Benchmarks!
75
104
 
76
- By default pg_upsert uses the primary key on your ActiveRecord table to determine if each record should be inserted or updated. You can override the column using the :key_field option:
105
+ Given a User model (which validates the presence of email and password):
106
+ ```console
107
+ 2.1.3 :008 > User
108
+ => User(id: integer, email: string, password: string, created_at: datetime, updated_at: datetime)
109
+ ```
77
110
 
111
+ And the following railsy code to create 10,000 users:
78
112
  ```ruby
79
- User.pg_upsert "/tmp/users.dat", :format => :binary, :key_column => ["external_twitter_id"]
113
+ def insert_dumb
114
+ time = Benchmark.measure do
115
+ (1..10000).each do |n|
116
+ User.create!(:email => "number#{n}@email.com", :password => "password#{n}")
117
+ end
118
+ end
119
+ puts time
120
+ end
80
121
  ```
81
122
 
82
- obviously, the field you pass must be a unique key in your database (this is not enforced at the moment, but will be)
123
+ Compared to the following code using Postgres_upsert:
124
+ ```ruby
125
+ def insert_smart
126
+ time = Benchmark.measure do
127
+ csv_string = CSV.generate do |csv|
128
+ csv << %w(email password) # CSV header row
129
+ (1..10000).each do |n|
130
+ csv << ["number#{n}@email.com", "password#{n}"]
131
+ end
132
+ end
133
+ io = StringIO.new(csv_string)
134
+ PostgresUpsert.write User, io, unique_key: "email"
135
+ end
136
+ puts time
137
+ end
138
+ ```
83
139
 
84
- passing :update_only = true will ensure that no new records are created, but records will be updated.
140
+ let's compare!
141
+
142
+ ```console
143
+ 2.1.3 :002 > insert_dumb
144
+ #...snip ~30k lines of output :( (10k queries, each wrapped in a transaction)
145
+ (0.3ms) COMMIT
146
+ 26.639246
147
+ 2.1.3 :004 > User.delete_all
148
+ SQL (15.4ms) DELETE FROM "users"
149
+ 2.1.3 :006 > insert_smart
150
+ #...snip ~30 lines of output, composing 5 sql queries...
151
+ 0.275503
152
+ ```
153
+
154
+ ...That's 26.6 seconds for classic create loop... vs. 0.276 seconds for postgres_upsert.
155
+ This is over 96X faster. And it only cost me ~6 extra lines of code.
156
+
157
+ Note that for the benchmark, my database is local. The performance improvement should only increase when we have network latency to worry about.
85
158
 
86
159
  ## Note on Patches/Pull Requests
87
160
 
88
- * Fork the project
89
- * add your feature/fix to your fork(rpsec tests pleaze)
161
+ I greatly appreciate contributions to this gem.
162
+
163
+ * Fork the project and clone the repo locally
164
+ * run 'bin/setup' to set up dependencies and create the test DB
165
+ * add your feature/fix to your fork (add and run rspec tests, please)
90
166
  * submit a PR
91
167
  * If you find an issue but can't fix it in a PR, please log an issue. I'll do my best.
92
168