salesforce_chunker 1.2.0 → 1.2.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +7 -1
- data/Gemfile.lock +1 -1
- data/README.md +35 -3
- data/lib/salesforce_chunker/connection.rb +5 -4
- data/lib/salesforce_chunker/manual_chunking_breakpoint_query.rb +6 -4
- data/lib/salesforce_chunker/version.rb +1 -1
- metadata +17 -17
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8c051e8de05fca9caf7049b418d7366752889787689fe22502f0eb74131cdbf1
+  data.tar.gz: 0df0dffe1900e42f584d3e7c067e2c643582707f6468d6f988743fb6466f506c
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 46ce51f398903355664620f9ee5274a28b5e74691d5ec5e1f5ff786410a530d547e07d7d35bfae77eb0de34ba690afc1a1732d2a8ec3eff81c87b954e4fe753b
+  data.tar.gz: bc1dc6ebb441f0a1b5e0253a8afd5b15a8d3440d8729228e0a62e136fbdfc1a22be5225908ec863461579709a826e16fa552ee7b3846e0ca4606844e749c6858
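The published SHA-256 digests above can be checked against a locally downloaded copy of the release artifacts. A minimal sketch using Ruby's standard library (the path and helper name are ours, not part of the gem):

```ruby
require "digest"

# Returns true if the file at `path` hashes to the expected SHA-256 digest,
# e.g. the data.tar.gz value published in checksums.yaml above.
def checksum_matches?(path, expected_sha256)
  Digest::SHA256.file(path).hexdigest == expected_sha256
end
```

`Digest::SHA512` works the same way for the SHA-512 entries.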
data/CHANGELOG.md
CHANGED
@@ -1,10 +1,16 @@
 # CHANGELOG
 
+## 1.2.1 - 2019-06-26
+
+- Fixed bug in Manual Chunking that could result in larger batches.
+- Added IOError to the types of errors that are retried.
+- Removed circular reference and warning about it.
+
 ## 1.2.0 - 2019-06-14
 
 - Added an include_deleted flag to perform a queryAll operation.
 - Disabled explicit GZIP encoding to work with the latest versions of HTTParty.
-- Added a retry for requests to recover from Net::ReadTimeout errors
+- Added a retry for requests to recover from Net::ReadTimeout errors.
 
 ## 1.1.1 - 2018-11-26
 
data/Gemfile.lock
CHANGED
data/README.md
CHANGED
@@ -55,6 +55,9 @@ client = SalesforceChunker::Client.new(
 | security_token | may be required depending on your Salesforce setup |
 | domain | optional. defaults to `"login"`. |
 | salesforce_version | optional. defaults to `"42.0"`. Must be >= `"33.0"` to use PK Chunking. |
+| logger | optional. logger to use. Must be an instance of, or similar to, a Rails logger. Use here if you want to log all API page requests. |
+| log_output | optional. log output to use, e.g. `STDOUT`. |
+
 
 #### Functions
 
@@ -63,6 +66,7 @@ client = SalesforceChunker::Client.new(
 | query |
 | single_batch_query | calls `query(job_type: "single_batch", **options)` |
 | primary_key_chunking_query | calls `query(job_type: "primary_key_chunking", **options)` |
+| manual_chunking_query | calls `query(job_type: "manual_chunking", **options)` |
 
 #### Query
 
@@ -88,12 +92,12 @@ end
 | --- | --- | --- |
 | query | required | SOQL query. |
 | object | required | Salesforce Object type. |
-| batch_size | optional | defaults to `100000`. Number of records to process in a batch. (
+| batch_size | optional | defaults to `100000`. Number of records to process in a batch. (Not used in Single Batch jobs) |
 | retry_seconds | optional | defaults to `10`. Number of seconds to wait before querying API for updated results. |
-| timeout_seconds | optional | defaults to `3600`. Number of seconds to wait before
+| timeout_seconds | optional | defaults to `3600`. Number of seconds to wait for a batch to process before the job is killed. |
 | logger | optional | logger to use. Must be an instance of, or similar to, a Rails logger. |
 | log_output | optional | log output to use, e.g. `STDOUT`. |
-| job_type | optional | defaults to `"primary_key_chunking"`. Can also be set to `"single_batch"`. |
+| job_type | optional | defaults to `"primary_key_chunking"`. Can also be set to `"single_batch"` or `"manual_chunking"`. |
 | include_deleted | optional | defaults to `false`. Whether to include deleted records. |
 
 `query` can either be called with a block, or will return an enumerator:
@@ -102,6 +106,34 @@ end
 names = client.query(query, object, options).map { |result| result["Name"] }
 ```
 
+### A discussion about Single Batch, Primary Key Chunking, and Manual Chunking job types
+
+One of the advantages of the Salesforce Bulk API over the other Salesforce APIs is the ability for Salesforce to process a number of requests (either queries or uploads) in parallel on their servers. The request chunks are referred to as batches.
+
+#### Single Batch Query
+
+In a single batch query, one SOQL statement is executed as a single batch. This works best if the total number of records to return is fewer than around 100,000, depending on memory usage and the number of fields being returned.
+
+#### Primary Key Chunking Query
+
+In Primary Key Chunking, the internal Salesforce PK chunking flag is used. Salesforce will create a number of batches automatically based on an internal Id index. See https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm
+
+#### Manual Chunking Query
+
+This approach is called "Manual" Chunking because it is our own implementation of PK Chunking in this gem. The gem downloads an ordered CSV list of all Ids it needs to download, and then uses this list to generate breakpoints that it uses to create batches.
+
+#### Primary Key Chunking Query vs Manual Chunking Query
+
+Advantages of Manual Chunking:
+
+- Manual Chunking takes the where clause of the SOQL statement into account. For example, if you are filtering a small subset of a large object, say 250k out of 20M records, Manual Chunking will split this into 3 batches of at most 100k each, while PK Chunking will split it into 200 batches, which uses up batches and API requests against your account and takes longer.
+- Any object can use Manual Chunking (according to Salesforce, PK chunking is supported only for the following objects: Account, Asset, Campaign, CampaignMember, Case, CaseHistory, Contact, Event, EventRelation, Lead, LoginHistory, Opportunity, Task, User, and custom objects.)
+
+Advantages of Primary Key Chunking:
+
+- Primary Key Chunking appears to be slightly faster, if using a PK Chunking eligible object and no where clause.
+- Primary Key Chunking may be less buggy, because many more people depend on the Salesforce API than on this gem.
+
 ### Under the hood: SalesforceChunker::Job
 
 Using `SalesforceChunker::Job`, you have more direct access to the Salesforce Bulk API functions, such as `create_batch`, `get_batch_statuses`, and `retrieve_batch_results`. This can be used to perform custom tasks, such as upserts or multiple batch queries.
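The batch-count claim in the Manual Chunking comparison above can be sanity-checked with simple arithmetic (numbers from the README example, not measured):

```ruby
# 250k records match the where clause out of 20M total records;
# both strategies use chunks of up to 100k records.
matching_records = 250_000
total_records    = 20_000_000
chunk_size       = 100_000

# Manual Chunking batches only the Ids matched by the where clause...
manual_batches = (matching_records / chunk_size.to_f).ceil  # => 3
# ...while PK Chunking slices the entire Id range of the object.
pk_batches = (total_records / chunk_size.to_f).ceil         # => 200
```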
data/lib/salesforce_chunker/connection.rb
CHANGED
@@ -32,7 +32,7 @@ module SalesforceChunker
 
     def post(url, body, headers={})
       @log.info "POST: #{url}"
-      response = self.class.retry_block(log: @log
+      response = self.class.retry_block(log: @log) do
         HTTParty.post(@base_url + url, headers: @default_headers.merge(headers), body: body)
       end
       self.class.check_response_error(response.parsed_response)
@@ -40,7 +40,7 @@ module SalesforceChunker
 
     def get_json(url, headers={})
       @log.info "GET: #{url}"
-      response = self.class.retry_block(log: @log
+      response = self.class.retry_block(log: @log) do
         HTTParty.get(@base_url + url, headers: @default_headers.merge(headers))
       end
       self.class.check_response_error(response.parsed_response)
@@ -48,7 +48,7 @@ module SalesforceChunker
 
     def get(url, headers={})
       @log.info "GET: #{url}"
-      self.class.retry_block(log: @log
+      self.class.retry_block(log: @log) do
        HTTParty.get(@base_url + url, headers: @default_headers.merge(headers)).body
       end
     end
@@ -85,8 +85,9 @@ module SalesforceChunker
 
     MAX_TRIES = 5
     SLEEP_DURATION = 10
+    RESCUED_EXCEPTIONS = [Net::ReadTimeout, IOError]
 
-    def self.retry_block(log:
+    def self.retry_block(log: Logger.new(nil), tries: MAX_TRIES, sleep_duration: SLEEP_DURATION, rescues: RESCUED_EXCEPTIONS, &block)
       attempt_number = 1
 
       begin
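The hunks above widen the retried exception list to include `IOError` (per the 1.2.1 changelog entry). A self-contained sketch of that retry behaviour, reconstructed from the diff — the constant and parameter names mirror the diff, but the method body is our reading of it, not the gem's verbatim code:

```ruby
require "logger"
require "net/protocol" # defines Net::ReadTimeout

MAX_TRIES = 5
SLEEP_DURATION = 10
RESCUED_EXCEPTIONS = [Net::ReadTimeout, IOError]

# Run the block, retrying on the listed exceptions up to `tries` times,
# sleeping between attempts; any other exception propagates immediately.
def retry_block(log: Logger.new(nil), tries: MAX_TRIES, sleep_duration: SLEEP_DURATION, rescues: RESCUED_EXCEPTIONS)
  attempt_number = 1
  begin
    yield
  rescue *rescues => e
    raise if attempt_number >= tries
    log.info("#{e.class} raised, retrying (attempt #{attempt_number} of #{tries})")
    sleep(sleep_duration)
    attempt_number += 1
    retry
  end
end
```

The last attempt re-raises, so callers still see the original `Net::ReadTimeout` or `IOError` when the API stays unreachable.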
data/lib/salesforce_chunker/manual_chunking_breakpoint_query.rb
CHANGED
@@ -12,18 +12,20 @@ module SalesforceChunker
     end
 
     def get_batch_results(batch_id)
-      retrieve_batch_results(batch_id).
+      retrieve_batch_results(batch_id).each_with_index do |result_id, result_index|
         results = retrieve_raw_results(batch_id, result_id)
 
         @log.info "Generating breakpoints from CSV results"
-        process_csv_results(results) { |result| yield result }
+        process_csv_results(results, result_index > 0) { |result| yield result }
       end
     end
 
-    def process_csv_results(
-      lines =
+    def process_csv_results(input, include_first_element)
+      lines = input.each_line
       headers = lines.next
 
+      yield(lines.peek.chomp.gsub("\"", "")) if include_first_element
+
       loop do
         @batch_size.times { lines.next }
         yield(lines.peek.chomp.gsub("\"", ""))
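This hunk is the "larger batches" fix from the changelog: Salesforce can split one batch's output across several result files, and the old code restarted its line counting in each file without emitting a breakpoint at the boundary, so the chunk spanning two files could exceed the batch size. A standalone sketch of the fixed logic (the gem's `@batch_size` instance variable becomes a plain parameter here so the method is self-contained):

```ruby
# Walk a CSV of ordered Ids and yield every batch_size-th Id as a breakpoint.
# include_first_element is true for every result file after the first, so the
# file boundary itself becomes a breakpoint.
def process_csv_results(input, include_first_element, batch_size)
  lines = input.each_line
  headers = lines.next # discard the CSV header row

  yield(lines.peek.chomp.gsub("\"", "")) if include_first_element

  loop do
    batch_size.times { lines.next }          # skip batch_size Ids...
    yield(lines.peek.chomp.gsub("\"", ""))   # ...and emit the next Id unconsumed
  end                                        # StopIteration ends the loop at EOF
end
```

`loop` rescues the `StopIteration` raised by `lines.next`/`lines.peek` at end of input, which is what terminates the scan.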
metadata
CHANGED
@@ -1,94 +1,94 @@
 --- !ruby/object:Gem::Specification
 name: salesforce_chunker
 version: !ruby/object:Gem::Version
-  version: 1.2.
+  version: 1.2.1
 platform: ruby
 authors:
 - Curtis Holmes
-autorequire:
+autorequire:
 bindir: exe
 cert_chain: []
-date: 2019-06-
+date: 2019-06-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
-  name: httparty
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '0.15'
-
+  name: httparty
   prerelease: false
+  type: :runtime
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '0.15'
 - !ruby/object:Gem::Dependency
-  name: bundler
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '1.16'
-
+  name: bundler
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '1.16'
 - !ruby/object:Gem::Dependency
-  name: rake
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '10.0'
-
+  name: rake
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '10.0'
 - !ruby/object:Gem::Dependency
-  name: minitest
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '5.0'
-
+  name: minitest
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '5.0'
 - !ruby/object:Gem::Dependency
-  name: mocha
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '1.5'
-
+  name: mocha
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '1.5'
 - !ruby/object:Gem::Dependency
-  name: pry
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
     - !ruby/object:Gem::Version
       version: '0.11'
-
+  name: pry
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
@@ -127,7 +127,7 @@ homepage: https://github.com/Shopify/salesforce_chunker
 licenses:
 - MIT
 metadata: {}
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -143,7 +143,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     version: '0'
 requirements: []
 rubygems_version: 3.0.2
-signing_key:
+signing_key:
 specification_version: 4
 summary: Salesforce Bulk API Client
 test_files: []