salesforce_chunker 1.2.0 → 1.2.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +7 -1
- data/Gemfile.lock +1 -1
- data/README.md +35 -3
- data/lib/salesforce_chunker/connection.rb +5 -4
- data/lib/salesforce_chunker/manual_chunking_breakpoint_query.rb +6 -4
- data/lib/salesforce_chunker/version.rb +1 -1
- metadata +17 -17
checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8c051e8de05fca9caf7049b418d7366752889787689fe22502f0eb74131cdbf1
+  data.tar.gz: 0df0dffe1900e42f584d3e7c067e2c643582707f6468d6f988743fb6466f506c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 46ce51f398903355664620f9ee5274a28b5e74691d5ec5e1f5ff786410a530d547e07d7d35bfae77eb0de34ba690afc1a1732d2a8ec3eff81c87b954e4fe753b
+  data.tar.gz: bc1dc6ebb441f0a1b5e0253a8afd5b15a8d3440d8729228e0a62e136fbdfc1a22be5225908ec863461579709a826e16fa552ee7b3846e0ca4606844e749c6858
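Digests of this kind can be recomputed locally to verify a downloaded gem. A minimal sketch using Ruby's standard library (`member_digests` is a hypothetical helper name, not part of this gem):

```ruby
require "digest"

# Sketch: recompute the kind of digests stored in checksums.yaml.
# A .gem file is a tar archive containing metadata.gz and data.tar.gz;
# the YAML above records a SHA256 and a SHA512 digest of each member.
def member_digests(bytes)
  {
    "SHA256" => Digest::SHA256.hexdigest(bytes),
    "SHA512" => Digest::SHA512.hexdigest(bytes),
  }
end
```

Comparing the computed hex digests against the entries above confirms the archive members were not altered in transit.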
data/CHANGELOG.md CHANGED

@@ -1,10 +1,16 @@
 # CHANGELOG
 
+## 1.2.1 - 2019-06-26
+
+- Fixed bug in Manual Chunking that could result in larger batches.
+- Added IOError to the types of errors that are retried.
+- Removed circular reference and warning about it.
+
 ## 1.2.0 - 2019-06-14
 
 - Added an include_deleted flag to perform a queryAll operation.
 - Disabled explicit GZIP encoding to work with the latest versions of HTTParty.
-- Added a retry for requests to recover from Net::ReadTimeout errors
+- Added a retry for requests to recover from Net::ReadTimeout errors.
 
 ## 1.1.1 - 2018-11-26
data/Gemfile.lock
CHANGED
data/README.md CHANGED

@@ -55,6 +55,9 @@ client = SalesforceChunker::Client.new(
 | security_token | may be required depending on your Salesforce setup |
 | domain | optional. defaults to `"login"`. |
 | salesforce_version | optional. defaults to `"42.0"`. Must be >= `"33.0"` to use PK Chunking. |
+| logger | optional. logger to use. Must be instance of or similar to rails logger. Use here if you want to log all API page requests. |
+| log_output | optional. log output to use. i.e. `STDOUT`. |
+
 
 #### Functions
 

@@ -63,6 +66,7 @@ client = SalesforceChunker::Client.new(
 | query |
 | single_batch_query | calls `query(job_type: "single_batch", **options)` |
 | primary_key_chunking_query | calls `query(job_type: "primary_key_chunking", **options)` |
+| manual_chunking_query | calls `query(job_type: "manual_chunking", **options)` |
 
 #### Query
 

@@ -88,12 +92,12 @@ end
 | --- | --- | --- |
 | query | required | SOQL query. |
 | object | required | Salesforce Object type. |
-| batch_size | optional | defaults to `100000`. Number of records to process in a batch. (
+| batch_size | optional | defaults to `100000`. Number of records to process in a batch. (Not used in Single Batch jobs) |
 | retry_seconds | optional | defaults to `10`. Number of seconds to wait before querying API for updated results. |
-| timeout_seconds | optional | defaults to `3600`. Number of seconds to wait before
+| timeout_seconds | optional | defaults to `3600`. Number of seconds to wait for a batch to process before job is killed. |
 | logger | optional | logger to use. Must be instance of or similar to rails logger. |
 | log_output | optional | log output to use. i.e. `STDOUT`. |
-| job_type | optional | defaults to `"primary_key_chunking"`. Can also be set to `"single_batch"`. |
+| job_type | optional | defaults to `"primary_key_chunking"`. Can also be set to `"single_batch"` or `"manual_chunking"`. |
 | include_deleted | optional | defaults to `false`. Whether to include deleted records. |
 
 `query` can either be called with a block, or will return an enumerator:

@@ -102,6 +106,34 @@ end
 names = client.query(query, object, options).map { |result| result["Name"] }
 ```
 
+### A discussion about Single Batch, Primary Key Chunking, and Manual Chunking job types
+
+One of the advantages of the Salesforce Bulk API over the other Salesforce APIs is the ability for Salesforce to process a number of requests (either queries or uploads) in parallel on their servers. The request chunks are referred to as batches.
+
+#### Single Batch Query
+
+In a single batch query, one SOQL statement is executed as a single batch. This works best if the total number of records to return is fewer than around 100,000, depending on memory usage and the number of fields being returned.
+
+#### Primary Key Chunking Query
+
+In Primary Key Chunking, the internal Salesforce PK chunking flag is used. Salesforce automatically creates a number of batches based on an internal Id index. See https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm
+
+#### Manual Chunking Query
+
+This approach is called "Manual" Chunking because it is our own implementation of PK Chunking in this gem. The gem downloads an ordered CSV list of all Ids it needs to download, then uses this list to generate breakpoints from which it creates batches.
+
+#### Primary Key Chunking Query vs Manual Chunking Query
+
+Advantages of Manual Chunking:
+
+- Manual Chunking takes the where clause of the SOQL statement into account. For example, if you are filtering down to a small subset of a large object, say 250k out of 20M records, Manual Chunking will split the query into 3 batches of at most 100k records, while PK Chunking will split it into 200 batches, which uses up batches and API requests against your account and takes longer.
+- Any object can use Manual Chunking. (According to Salesforce, PK chunking is supported only for the following objects: Account, Asset, Campaign, CampaignMember, Case, CaseHistory, Contact, Event, EventRelation, Lead, LoginHistory, Opportunity, Task, User, and custom objects.)
+
+Advantages of Primary Key Chunking:
+
+- Primary Key Chunking appears to be slightly faster when using a PK Chunking eligible object with no where clause.
+- Primary Key Chunking may be less buggy, because many more people depend on the Salesforce API than on this gem.
+
 ### Under the hood: SalesforceChunker::Job
 
 Using `SalesforceChunker::Job`, you have more direct access to the Salesforce Bulk API functions, such as `create_batch`, `get_batch_statuses`, and `retrieve_batch_results`. This can be used to perform custom tasks, such as upserts or multiple batch queries.
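The batch-count comparison in the README text added above (3 batches vs 200 for 250k matching records out of 20M) can be checked with a small arithmetic sketch. The function names are illustrative only, not part of the gem's API; the only assumption is ceiling division over the record count:

```ruby
BATCH_SIZE = 100_000 # the gem's documented default batch_size

# Manual Chunking only chunks the Ids that match the where clause.
def manual_chunking_batches(matching_records, batch_size = BATCH_SIZE)
  (matching_records.to_f / batch_size).ceil
end

# Salesforce PK Chunking splits the whole Id range of the object,
# regardless of the where clause.
def pk_chunking_batches(total_records, batch_size = BATCH_SIZE)
  (total_records.to_f / batch_size).ceil
end

manual_chunking_batches(250_000)    # => 3
pk_chunking_batches(20_000_000)     # => 200
```

This is why the README recommends Manual Chunking for heavily filtered queries: the batch count tracks the filtered result size, not the size of the whole object.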
data/lib/salesforce_chunker/connection.rb CHANGED

@@ -32,7 +32,7 @@ module SalesforceChunker
 
     def post(url, body, headers={})
       @log.info "POST: #{url}"
-      response = self.class.retry_block(log: @log
+      response = self.class.retry_block(log: @log) do
         HTTParty.post(@base_url + url, headers: @default_headers.merge(headers), body: body)
       end
       self.class.check_response_error(response.parsed_response)

@@ -40,7 +40,7 @@ module SalesforceChunker
 
     def get_json(url, headers={})
       @log.info "GET: #{url}"
-      response = self.class.retry_block(log: @log
+      response = self.class.retry_block(log: @log) do
         HTTParty.get(@base_url + url, headers: @default_headers.merge(headers))
       end
       self.class.check_response_error(response.parsed_response)

@@ -48,7 +48,7 @@ module SalesforceChunker
 
     def get(url, headers={})
       @log.info "GET: #{url}"
-      self.class.retry_block(log: @log
+      self.class.retry_block(log: @log) do
         HTTParty.get(@base_url + url, headers: @default_headers.merge(headers)).body
       end
     end

@@ -85,8 +85,9 @@ module SalesforceChunker
 
     MAX_TRIES = 5
     SLEEP_DURATION = 10
+    RESCUED_EXCEPTIONS = [Net::ReadTimeout, IOError]
 
-    def self.retry_block(log:
+    def self.retry_block(log: Logger.new(nil), tries: MAX_TRIES, sleep_duration: SLEEP_DURATION, rescues: RESCUED_EXCEPTIONS, &block)
       attempt_number = 1
 
       begin
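The new `retry_block` signature above reads as a generic retry helper: run the block, and retry it on any of the listed exception classes. A self-contained sketch of that pattern (not the gem's exact implementation; the body below is an assumption based only on the signature shown in the diff, and it defaults to `IOError` alone so it needs nothing beyond the standard library):

```ruby
require "logger"

# Sketch of a retry_block-style helper: retry the given block up to
# `tries` times when one of the exception classes in `rescues` is
# raised, sleeping `sleep_duration` seconds between attempts. A final
# failing attempt re-raises the original exception.
def retry_block(log: Logger.new(nil), tries: 5, sleep_duration: 10, rescues: [IOError])
  attempt_number = 1
  begin
    yield
  rescue *rescues => e
    raise if attempt_number >= tries
    log.warn "#{e.class} raised, retrying (attempt #{attempt_number}/#{tries})"
    sleep(sleep_duration)
    attempt_number += 1
    retry
  end
end

# Usage: a flaky operation that succeeds on its third attempt.
calls = 0
result = retry_block(sleep_duration: 0) do
  calls += 1
  raise IOError, "flaky connection" if calls < 3
  :ok
end
```

Passing the exception list as a keyword (as the 1.2.1 signature does with `RESCUED_EXCEPTIONS`) is what let the release add `IOError` to the retried errors without touching any call site.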
data/lib/salesforce_chunker/manual_chunking_breakpoint_query.rb CHANGED

@@ -12,18 +12,20 @@ module SalesforceChunker
     end
 
     def get_batch_results(batch_id)
-      retrieve_batch_results(batch_id).
+      retrieve_batch_results(batch_id).each_with_index do |result_id, result_index|
        results = retrieve_raw_results(batch_id, result_id)
 
        @log.info "Generating breakpoints from CSV results"
-        process_csv_results(results) { |result| yield result }
+        process_csv_results(results, result_index > 0) { |result| yield result }
      end
     end
 
-    def process_csv_results(
-      lines =
+    def process_csv_results(input, include_first_element)
+      lines = input.each_line
       headers = lines.next
 
+      yield(lines.peek.chomp.gsub("\"", "")) if include_first_element
+
       loop do
         @batch_size.times { lines.next }
         yield(lines.peek.chomp.gsub("\"", ""))
metadata CHANGED

@@ -1,94 +1,94 @@
 --- !ruby/object:Gem::Specification
 name: salesforce_chunker
 version: !ruby/object:Gem::Version
-  version: 1.2.
+  version: 1.2.1
 platform: ruby
 authors:
 - Curtis Holmes
-autorequire:
+autorequire:
 bindir: exe
 cert_chain: []
-date: 2019-06-
+date: 2019-06-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
-  name: httparty
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '0.15'
-
+  name: httparty
   prerelease: false
+  type: :runtime
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '0.15'
 - !ruby/object:Gem::Dependency
-  name: bundler
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.16'
-
+  name: bundler
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.16'
 - !ruby/object:Gem::Dependency
-  name: rake
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '10.0'
-
+  name: rake
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '10.0'
 - !ruby/object:Gem::Dependency
-  name: minitest
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '5.0'
-
+  name: minitest
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '5.0'
 - !ruby/object:Gem::Dependency
-  name: mocha
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.5'
-
+  name: mocha
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.5'
 - !ruby/object:Gem::Dependency
-  name: pry
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '0.11'
-
+  name: pry
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"

@@ -127,7 +127,7 @@ homepage: https://github.com/Shopify/salesforce_chunker
 licenses:
 - MIT
 metadata: {}
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib

@@ -143,7 +143,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     version: '0'
 requirements: []
 rubygems_version: 3.0.2
-signing_key:
+signing_key:
 specification_version: 4
 summary: Salesforce Bulk API Client
 test_files: []