salesforce_chunker 1.2.0 → 1.2.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: bce2e320c05d4d8a6cd855eeedc8eac3bb2fd02a0a2d90be3743d156fc326127
-  data.tar.gz: 88e0b977efdf78f298adceb9a09b5d31b67ecd59e58884e9ac2d649de15ac30b
+  metadata.gz: 8c051e8de05fca9caf7049b418d7366752889787689fe22502f0eb74131cdbf1
+  data.tar.gz: 0df0dffe1900e42f584d3e7c067e2c643582707f6468d6f988743fb6466f506c
 SHA512:
-  metadata.gz: 3294b08e552ac9dad461f13c9678ee48823f8fc3a708db6acabbf6574ad6f0715bcb0947fee78836d9f0d212ec713b555ab017268664d25ee40c1e75a78f939a
-  data.tar.gz: addb0a2b786fc330f1366582e616bf9e63677fe6664873d14189f923c9a96527ee50a17685aab2b8f021b548cbb0afaa20f59eaf6b85a7785633a4f62b5d599f
+  metadata.gz: 46ce51f398903355664620f9ee5274a28b5e74691d5ec5e1f5ff786410a530d547e07d7d35bfae77eb0de34ba690afc1a1732d2a8ec3eff81c87b954e4fe753b
+  data.tar.gz: bc1dc6ebb441f0a1b5e0253a8afd5b15a8d3440d8729228e0a62e136fbdfc1a22be5225908ec863461579709a826e16fa552ee7b3846e0ca4606844e749c6858
data/CHANGELOG.md CHANGED
@@ -1,10 +1,16 @@
 # CHANGELOG
 
+## 1.2.1 - 2019-06-26
+
+- Fixed bug in Manual Chunking that could result in larger batches.
+- Added IOError to the types of errors that are retried.
+- Removed circular reference and warning about it.
+
 ## 1.2.0 - 2019-06-14
 
 - Added an include_deleted flag to perform a queryAll operation.
 - Disabled explicit GZIP encoding to work with the latest versions of HTTParty.
-- Added a retry for requests to recover from Net::ReadTimeout errors
+- Added a retry for requests to recover from Net::ReadTimeout errors.
 
 ## 1.1.1 - 2018-11-26
 
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    salesforce_chunker (1.2.0)
+    salesforce_chunker (1.2.1)
       httparty (~> 0.15)
 
 GEM
data/README.md CHANGED
@@ -55,6 +55,9 @@ client = SalesforceChunker::Client.new(
 | security_token | may be required depending on your Salesforce setup |
 | domain | optional. defaults to `"login"`. |
 | salesforce_version | optional. defaults to `"42.0"`. Must be >= `"33.0"` to use PK Chunking. |
+| logger | optional. logger to use. Must be an instance of, or similar to, a Rails logger. Use this if you want to log all API page requests. |
+| log_output | optional. log output to use, e.g. `STDOUT`. |
+
 
 #### Functions
 
@@ -63,6 +66,7 @@ client = SalesforceChunker::Client.new(
 | query |
 | single_batch_query | calls `query(job_type: "single_batch", **options)` |
 | primary_key_chunking_query | calls `query(job_type: "primary_key_chunking", **options)` |
+| manual_chunking_query | calls `query(job_type: "manual_chunking", **options)` |
 
 #### Query
 
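The convenience methods in the table above are thin wrappers around `query`. A minimal sketch of that forwarding pattern (`ClientSketch` and its hash return value are illustrative stand-ins, not the gem's real implementation, which performs Bulk API calls):

```ruby
# Illustrative stand-in for SalesforceChunker::Client; instead of calling
# the Bulk API, this sketch just records what was requested.
class ClientSketch
  def query(soql, object, job_type: "primary_key_chunking", **options)
    { soql: soql, object: object, job_type: job_type, options: options }
  end

  # Mirrors the table: manual_chunking_query calls
  # query(job_type: "manual_chunking", **options).
  def manual_chunking_query(soql, object, **options)
    query(soql, object, job_type: "manual_chunking", **options)
  end

  def single_batch_query(soql, object, **options)
    query(soql, object, job_type: "single_batch", **options)
  end
end
```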
@@ -88,12 +92,12 @@ end
 | --- | --- | --- |
 | query | required | SOQL query. |
 | object | required | Salesforce Object type. |
-| batch_size | optional | defaults to `100000`. Number of records to process in a batch. (Only for PK Chunking) |
+| batch_size | optional | defaults to `100000`. Number of records to process in a batch. (Not used in Single Batch jobs.) |
 | retry_seconds | optional | defaults to `10`. Number of seconds to wait before querying the API for updated results. |
-| timeout_seconds | optional | defaults to `3600`. Number of seconds to wait before query is killed. |
+| timeout_seconds | optional | defaults to `3600`. Number of seconds to wait for a batch to process before the job is killed. |
 | logger | optional | logger to use. Must be an instance of, or similar to, a Rails logger. |
 | log_output | optional | log output to use, e.g. `STDOUT`. |
-| job_type | optional | defaults to `"primary_key_chunking"`. Can also be set to `"single_batch"`. |
+| job_type | optional | defaults to `"primary_key_chunking"`. Can also be set to `"single_batch"` or `"manual_chunking"`. |
 | include_deleted | optional | defaults to `false`. Whether to include deleted records. |
 
 `query` can either be called with a block, or will return an enumerator:
@@ -102,6 +106,34 @@ end
 names = client.query(query, object, options).map { |result| result["Name"] }
 ```
 
+### A discussion of Single Batch, Primary Key Chunking, and Manual Chunking job types
+
+One of the advantages of the Salesforce Bulk API over the other Salesforce APIs is that Salesforce can process a number of requests (either queries or uploads) in parallel on its servers. The request chunks are referred to as batches.
+
+#### Single Batch Query
+
+In a single batch query, one SOQL statement is executed as a single batch. This works best if the total number of records to return is fewer than around 100,000, depending on memory usage and the number of fields being returned.
+
+#### Primary Key Chunking Query
+
+In Primary Key Chunking, the internal Salesforce PK Chunking flag is used. Salesforce automatically creates a number of batches based on an internal Id index. See https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm
+
+#### Manual Chunking Query
+
+This approach is called "Manual" Chunking because it is this gem's own implementation of PK Chunking. The gem downloads an ordered CSV list of all the Ids it needs to download, and then uses this list to generate the breakpoints that define the batches.
+
+#### Primary Key Chunking Query vs Manual Chunking Query
+
+Advantages of Manual Chunking:
+
+- Manual Chunking takes the where clause of the SOQL statement into account. For example, if you are filtering down to a small subset of a large object, say 250k out of 20M records, Manual Chunking will split this into 3 batches of at most 100k each, while PK Chunking will split it into 200 batches, which uses up batches and API requests against your account and takes longer.
+- Any object can use Manual Chunking. (According to Salesforce, PK Chunking is supported only for the following objects: Account, Asset, Campaign, CampaignMember, Case, CaseHistory, Contact, Event, EventRelation, Lead, LoginHistory, Opportunity, Task, User, and custom objects.)
+
+Advantages of Primary Key Chunking:
+
+- Primary Key Chunking appears to be slightly faster when querying a PK Chunking-eligible object with no where clause.
+- Primary Key Chunking may be less buggy, because many more people depend on the Salesforce API than on this gem.
+
 ### Under the hood: SalesforceChunker::Job
 
 Using `SalesforceChunker::Job`, you have more direct access to the Salesforce Bulk API functions, such as `create_batch`, `get_batch_statuses`, and `retrieve_batch_results`. This can be used to perform custom tasks, such as upserts or multiple batch queries.
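The batch-count comparison in the Manual Chunking discussion can be checked with a little arithmetic (plain Ruby, no API calls; the 250k/20M figures come from the example above, and `batches_needed` is an illustrative helper, not part of the gem):

```ruby
# Number of batches needed to cover `total` records at `batch_size` records each.
def batches_needed(total, batch_size)
  (total.to_f / batch_size).ceil
end

# Manual Chunking batches only the Ids matched by the where clause (250k)...
manual_batches = batches_needed(250_000, 100_000)    # => 3
# ...while PK Chunking chunks the object's full Id range (20M),
# regardless of the filter.
pk_batches = batches_needed(20_000_000, 100_000)     # => 200
```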
@@ -32,7 +32,7 @@ module SalesforceChunker
 
     def post(url, body, headers={})
       @log.info "POST: #{url}"
-      response = self.class.retry_block(log: @log, rescues: Net::ReadTimeout) do
+      response = self.class.retry_block(log: @log) do
        HTTParty.post(@base_url + url, headers: @default_headers.merge(headers), body: body)
       end
       self.class.check_response_error(response.parsed_response)
@@ -40,7 +40,7 @@ module SalesforceChunker
 
     def get_json(url, headers={})
       @log.info "GET: #{url}"
-      response = self.class.retry_block(log: @log, rescues: Net::ReadTimeout) do
+      response = self.class.retry_block(log: @log) do
        HTTParty.get(@base_url + url, headers: @default_headers.merge(headers))
       end
       self.class.check_response_error(response.parsed_response)
@@ -48,7 +48,7 @@ module SalesforceChunker
 
     def get(url, headers={})
       @log.info "GET: #{url}"
-      self.class.retry_block(log: @log, rescues: Net::ReadTimeout) do
+      self.class.retry_block(log: @log) do
        HTTParty.get(@base_url + url, headers: @default_headers.merge(headers)).body
       end
     end
@@ -85,8 +85,9 @@ module SalesforceChunker
 
     MAX_TRIES = 5
     SLEEP_DURATION = 10
+    RESCUED_EXCEPTIONS = [Net::ReadTimeout, IOError]
 
-    def self.retry_block(log: log, tries: MAX_TRIES, sleep_duration: SLEEP_DURATION, rescues:, &block)
+    def self.retry_block(log: Logger.new(nil), tries: MAX_TRIES, sleep_duration: SLEEP_DURATION, rescues: RESCUED_EXCEPTIONS, &block)
       attempt_number = 1
 
       begin
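The retry helper changed above is only partially visible in the diff. Here is a standalone behavioral sketch, under the assumption that the helper simply re-runs the block after a sleep until the attempts are exhausted (the signature and constants match the diff; the rescue body is reconstructed, not copied from the gem):

```ruby
require "logger"
require "net/http" # defines Net::ReadTimeout

MAX_TRIES = 5
SLEEP_DURATION = 10
RESCUED_EXCEPTIONS = [Net::ReadTimeout, IOError]

# Run the block; on a rescued exception, sleep and retry until `tries`
# attempts have been made, then re-raise the last error.
def retry_block(log: Logger.new(nil), tries: MAX_TRIES, sleep_duration: SLEEP_DURATION, rescues: RESCUED_EXCEPTIONS, &block)
  attempt_number = 1
  begin
    block.call
  rescue *rescues => e
    raise if attempt_number >= tries
    log.info "Attempt #{attempt_number} failed with #{e.class}; retrying in #{sleep_duration}s"
    sleep(sleep_duration)
    attempt_number += 1
    retry
  end
end
```

Note that defaulting `log:` to `Logger.new(nil)` (as in the 1.2.1 change) gives a no-op logger, fixing the circular `log: log` default mentioned in the changelog.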
@@ -12,18 +12,20 @@ module SalesforceChunker
     end
 
     def get_batch_results(batch_id)
-      retrieve_batch_results(batch_id).each do |result_id|
+      retrieve_batch_results(batch_id).each_with_index do |result_id, result_index|
        results = retrieve_raw_results(batch_id, result_id)
 
        @log.info "Generating breakpoints from CSV results"
-        process_csv_results(results) { |result| yield result }
+        process_csv_results(results, result_index > 0) { |result| yield result }
       end
     end
 
-    def process_csv_results(result)
-      lines = result.each_line
+    def process_csv_results(input, include_first_element)
+      lines = input.each_line
       headers = lines.next
 
+      yield(lines.peek.chomp.gsub("\"", "")) if include_first_element
+
      loop do
        @batch_size.times { lines.next }
        yield(lines.peek.chomp.gsub("\"", ""))
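The `process_csv_results` change above is the Manual Chunking fix from the changelog: when a batch's results arrive as several result files, each file after the first must contribute its first Id as a breakpoint, or the rows straddling the file boundary merge into one oversized batch. A standalone sketch of the breakpoint logic (`breakpoints` is an illustrative helper that returns an array; the real method yields to a block):

```ruby
# Given a CSV of ordered Ids, collect the first Id of every batch after
# the first as a breakpoint. `include_first_element` mirrors the 1.2.1
# fix for result files after the first one.
def breakpoints(csv, batch_size, include_first_element: false)
  found = []
  lines = csv.each_line
  lines.next                                     # skip the header row
  found << lines.peek.chomp.gsub("\"", "") if include_first_element
  loop do                                        # loop swallows StopIteration
    batch_size.times { lines.next }              # advance past one batch
    found << lines.peek.chomp.gsub("\"", "")     # next batch's first Id
  end
  found
end
```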
@@ -1,3 +1,3 @@
 module SalesforceChunker
-  VERSION = "1.2.0"
+  VERSION = "1.2.1"
 end
metadata CHANGED
@@ -1,94 +1,94 @@
 --- !ruby/object:Gem::Specification
 name: salesforce_chunker
 version: !ruby/object:Gem::Version
-  version: 1.2.0
+  version: 1.2.1
 platform: ruby
 authors:
 - Curtis Holmes
-autorequire:
+autorequire:
 bindir: exe
 cert_chain: []
-date: 2019-06-14 00:00:00.000000000 Z
+date: 2019-06-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
-  name: httparty
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '0.15'
-  type: :runtime
+  name: httparty
   prerelease: false
+  type: :runtime
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '0.15'
 - !ruby/object:Gem::Dependency
-  name: bundler
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '1.16'
-  type: :development
+  name: bundler
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '1.16'
 - !ruby/object:Gem::Dependency
-  name: rake
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
-  type: :development
+  name: rake
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
 - !ruby/object:Gem::Dependency
-  name: minitest
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '5.0'
-  type: :development
+  name: minitest
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '5.0'
 - !ruby/object:Gem::Dependency
-  name: mocha
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '1.5'
-  type: :development
+  name: mocha
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '1.5'
 - !ruby/object:Gem::Dependency
-  name: pry
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
      - !ruby/object:Gem::Version
        version: '0.11'
-  type: :development
+  name: pry
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
@@ -127,7 +127,7 @@ homepage: https://github.com/Shopify/salesforce_chunker
 licenses:
 - MIT
 metadata: {}
-post_install_message:
+post_install_message:
 rdoc_options: []
 require_paths:
 - lib
@@ -143,7 +143,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     version: '0'
 requirements: []
 rubygems_version: 3.0.2
-signing_key:
+signing_key:
 specification_version: 4
 summary: Salesforce Bulk API Client
 test_files: []