upsert 0.2.0 → 0.2.1
- data/CHANGELOG +5 -0
- data/README.md +31 -25
- data/lib/upsert/mysql2_client.rb +13 -2
- data/lib/upsert/version.rb +1 -1
- data/test/helper.rb +6 -2
- data/test/shared/reserved_words.rb +10 -8
- metadata +2 -1
data/CHANGELOG
ADDED
data/README.md
CHANGED
@@ -1,43 +1,53 @@
 # Upsert
 
-Finally, all those SQL MERGE tricks codified so that you can do "upsert" on MySQL, PostgreSQL, and
+Finally, all those SQL MERGE tricks codified so that you can do "upsert" on MySQL, PostgreSQL, and SQLite.
 
-
+You pass a selector that uniquely identifies a row, whether it exists or not. You pass a set of attributes that should be set on that row. Based on what database is being used, one of a number of SQL MERGE-like tricks are used.
 
-
+The second argument is currently (mis)named a "document" because this was inspired by [mongo-ruby-driver's update method](http://api.mongodb.org/ruby/1.6.4/Mongo/Collection.html#update-instance_method).
 
-
-
-
-    end
+## Usage
+
+### One by one
 
-
+Faster than just doing `Pet.create`... 85% faster on PostgreSQL, for example. But no validations or anything.
 
     upsert = Upsert.new Pet.connection, Pet.table_name
-
-
-    upsert.row selector, document
+    upsert.row({:name => 'Jerry'}, :breed => 'beagle')
+    upsert.row({:name => 'Pierre'}, :breed => 'tabby')
 
-### Streaming
+### Streaming
 
-Rows are buffered in memory until it's efficient to send them to the database.
+Rows are buffered in memory until it's efficient to send them to the database. Currently this only provides an advantage on MySQL because it uses `ON DUPLICATE KEY UPDATE`... but if a similar method appears in PostgreSQL, the same code will still work.
 
     Upsert.stream(Pet.connection, Pet.table_name) do |upsert|
-      # [...]
       upsert.row({:name => 'Jerry'}, :breed => 'beagle')
-      # [...]
       upsert.row({:name => 'Pierre'}, :breed => 'tabby')
-      # [...]
     end
 
-###
+### `ActiveRecord::Base.upsert` (optional)
 
 For bulk upserts, you probably still want to use `Upsert.stream`.
 
-
-
-
-
+    require 'upsert/active_record_upsert'
+    Pet.upsert({:name => 'Jerry'}, :breed => 'beagle')
+    Pet.upsert({:name => 'Pierre'}, :breed => 'tabby')
+
+### Gotchas
+
+Currently, the first row you pass in determines the columns that will be used. That's useful for mass importing of many rows with the same columns, but is surprising if you're trying to use a single `Upsert` object to add arbitrary data. For example, this won't work:
+
+    Upsert.stream(Pet.connection, Pet.table_name) do |upsert|
+      upsert.row({:name => 'Jerry'}, :breed => 'beagle')
+      upsert.row({:tag_number => 456}, :spiel => 'great cat') # won't work - doesn't use same columns
+    end
+
+You would need to use a new `Upsert` object. On the other hand, this is totally fine:
+
+    Pet.upsert({:name => 'Jerry'}, :breed => 'beagle')
+    Pet.upsert({:tag_number => 456}, :spiel => 'great cat')
+
+Please send in a pull request if you think there's a better way!
 
 ## Real-world usage
 
@@ -204,10 +214,6 @@ You could also use [activerecord-import](https://github.com/zdennis/activerecord
This, however, only works on MySQL and requires ActiveRecord—and if all you are doing is upserts, `upsert` is tested to be 40% faster. And you don't have to put all of the rows to be upserted into a single huge array - you can stream them using `Upsert.stream`.
 
-### Loosely based on mongo-ruby-driver's upsert functionality
-
-The `selector` and `document` arguments are inspired by the upsert functionality of the [mongo-ruby-driver's update method](http://api.mongodb.org/ruby/1.6.4/Mongo/Collection.html#update-instance_method).
-
 ## Copyright
 
 Copyright 2012 Brighter Planet, Inc.
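The README changes above describe two behaviours that are easier to see in code: rows are buffered and sent to MySQL in one `ON DUPLICATE KEY UPDATE` statement, and the first row fixes the column list. The following is a toy sketch of that idea only, not the gem's implementation. `ToyBatch`, its `quote` helper, and the exact SQL shape are hypothetical names and assumptions made for illustration; real code should rely on the database driver's escaping.

```ruby
# Toy illustration only: this is NOT the upsert gem's code.
class ToyBatch
  def initialize(table_name)
    @table_name = table_name
    @columns = nil
    @rows = []
  end

  # selector identifies the row; document holds the attributes to set.
  def row(selector, document = {})
    merged = selector.merge(document)
    @columns ||= merged.keys # the first row fixes the column list
    unless merged.keys == @columns
      raise ArgumentError, "expected columns #{@columns.inspect}, got #{merged.keys.inspect}"
    end
    @rows << merged.values
    self
  end

  # Build one multi-row statement for everything buffered so far.
  def to_sql
    raise ArgumentError, 'no rows buffered' if @rows.empty?
    cols = @columns.map { |c| "`#{c}`" }.join(', ')
    values = @rows.map { |vals| "(#{vals.map { |v| quote(v) }.join(', ')})" }.join(', ')
    updates = @columns.map { |c| "`#{c}`=VALUES(`#{c}`)" }.join(', ')
    "INSERT INTO `#{@table_name}` (#{cols}) VALUES #{values} ON DUPLICATE KEY UPDATE #{updates}"
  end

  private

  # Naive quoting for demonstration; never use this for untrusted input.
  def quote(value)
    value.is_a?(Numeric) ? value.to_s : "'#{value.to_s.gsub("'", "''")}'"
  end
end

batch = ToyBatch.new('pets')
batch.row({:name => 'Jerry'}, :breed => 'beagle')
batch.row({:name => 'Pierre'}, :breed => 'tabby')
sql = batch.to_sql
```

Passing `{:tag_number => 456}, :spiel => 'great cat'` to the same `batch` raises `ArgumentError` in this sketch, which mirrors the "won't work" case in the Gotchas section: a multi-row `VALUES` list needs every row to supply the same columns.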
data/lib/upsert/mysql2_client.rb
CHANGED
@@ -70,8 +70,19 @@ class Upsert
     end
 
     def estimate_variable_sql_bytesize(take)
-
-
+      n = (take / 10.0).ceil
+      sample = if RUBY_VERSION >= '1.9'
+        rows.first(take).sample(n)
+      else
+        # based on https://github.com/marcandre/backports/blob/master/lib/backports/1.8.7/array.rb
+        memo = rows.first(take)
+        n.times do |i|
+          r = i + Kernel.rand(take - i)
+          memo[i], memo[r] = memo[r], memo[i]
+        end
+        memo.first(n)
+      end
+      10.0 * sample.inject(0) { |sum, row| sum + row.values_sql_bytesize + 3 }
     end
 
     def sql_bytesize(take)
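The new `estimate_variable_sql_bytesize` body measures a random sample of roughly 10% of the first `take` rows and scales the sum by 10. A standalone sketch of that estimator follows, under stated assumptions: `ToyRow` is a hypothetical stand-in for the gem's row objects, `rows` is passed as an argument instead of being read from the client, and only the Ruby 1.9+ branch (`Array#sample`) is shown.

```ruby
# Hypothetical stand-in for the gem's row objects; only the one method the
# estimator needs is modelled here.
ToyRow = Struct.new(:values_sql_bytesize)

# Estimate the bytes needed for the first `take` rows by measuring a random
# sample of about 10% of them and multiplying the total by 10.
# The "+ 3" is copied from the diff; that it covers per-row punctuation is an
# assumption, the source does not say.
def estimate_variable_sql_bytesize(rows, take)
  n = (take / 10.0).ceil
  sample = rows.first(take).sample(n)
  10.0 * sample.inject(0) { |sum, row| sum + row.values_sql_bytesize + 3 }
end

rows = Array.new(100) { ToyRow.new(7) }
estimate = estimate_variable_sql_bytesize(rows, 100)
# => 1000.0 (10 sampled rows * (7 + 3) bytes * 10)
```

Because `n` is rounded up while the scale factor stays at 10, the estimate errs on the high side whenever `take` is not a multiple of 10. For example, `take = 5` samples one row and still multiplies by 10.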
data/lib/upsert/version.rb
CHANGED
data/test/helper.rb
CHANGED
@@ -50,9 +50,13 @@ MiniTest::Spec.class_eval do
     end
     2000.times do
       selector = ActiveSupport::OrderedHash.new
-      selector[:name] =
+      selector[:name] = if RUBY_VERSION >= '1.9'
+        names.sample(1).first
+      else
+        names.choice
+      end
       document = {
-        :lovability => BigDecimal.new(rand(1e11), 2),
+        :lovability => BigDecimal.new(rand(1e11).to_s, 2),
         :tag_number => rand(1e8),
         :spiel => SecureRandom.hex(rand(127)),
         :good => true,
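The two test-helper changes above are both about interpreter compatibility: picking a random element with `sample` or `choice`, and constructing a `BigDecimal` from a String instead of a number. A minimal sketch of the same idea using feature detection instead of a `RUBY_VERSION` string comparison is shown here. The helper names `pick_random` and `random_lovability` are hypothetical, and `Kernel#BigDecimal` is used instead of the `BigDecimal.new` call in the diff because `BigDecimal.new` was removed from later releases of the bigdecimal library; the precision argument from the diff is dropped.

```ruby
require 'bigdecimal'

# Pick one random element, preferring Array#sample and falling back to
# Array#choice, which the diff uses for pre-1.9 interpreters.
def pick_random(array)
  array.respond_to?(:sample) ? array.sample : array.choice
end

# Build the BigDecimal from a String, as the new line in the diff does.
def random_lovability
  BigDecimal(rand(1e11).to_s)
end

names = %w[Jerry Pierre]
name = pick_random(names)
lovability = random_lovability
```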
data/test/shared/reserved_words.rb
CHANGED
@@ -27,14 +27,16 @@ shared_examples_for "doesn't blow up on reserved words" do
     nasty.auto_upgrade!
   end
 
-
-
-
-
-
-
-
-
+  describe "reserved words" do
+    nasties.each do |nasty, words|
+      it "doesn't die on reserved words #{words.join(',')}" do
+        upsert = Upsert.new connection, nasty.table_name
+        random = rand(1e3).to_s
+        selector = { :fake_primary_key => random, words.first => words.first }
+        document = words[1..-1].inject({}) { |memo, word| memo[word] = word; memo }
+        assert_creates nasty, [selector.merge(document)] do
+          upsert.row selector, document
+        end
       end
     end
   end
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: upsert
 version: !ruby/object:Gem::Version
-  version: 0.2.0
+  version: 0.2.1
   prerelease:
 platform: ruby
 authors:
@@ -182,6 +182,7 @@ extra_rdoc_files: []
 files:
 - .gitignore
 - .yardopts
+- CHANGELOG
 - Gemfile
 - LICENSE
 - README.md