salesforce_chunker 1.2.0 → 1.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: bce2e320c05d4d8a6cd855eeedc8eac3bb2fd02a0a2d90be3743d156fc326127
-  data.tar.gz: 88e0b977efdf78f298adceb9a09b5d31b67ecd59e58884e9ac2d649de15ac30b
+  metadata.gz: 8c051e8de05fca9caf7049b418d7366752889787689fe22502f0eb74131cdbf1
+  data.tar.gz: 0df0dffe1900e42f584d3e7c067e2c643582707f6468d6f988743fb6466f506c
 SHA512:
-  metadata.gz: 3294b08e552ac9dad461f13c9678ee48823f8fc3a708db6acabbf6574ad6f0715bcb0947fee78836d9f0d212ec713b555ab017268664d25ee40c1e75a78f939a
-  data.tar.gz: addb0a2b786fc330f1366582e616bf9e63677fe6664873d14189f923c9a96527ee50a17685aab2b8f021b548cbb0afaa20f59eaf6b85a7785633a4f62b5d599f
+  metadata.gz: 46ce51f398903355664620f9ee5274a28b5e74691d5ec5e1f5ff786410a530d547e07d7d35bfae77eb0de34ba690afc1a1732d2a8ec3eff81c87b954e4fe753b
+  data.tar.gz: bc1dc6ebb441f0a1b5e0253a8afd5b15a8d3440d8729228e0a62e136fbdfc1a22be5225908ec863461579709a826e16fa552ee7b3846e0ca4606844e749c6858
data/CHANGELOG.md CHANGED
@@ -1,10 +1,16 @@
 # CHANGELOG
 
+## 1.2.1 - 2019-06-26
+
+- Fixed a bug in Manual Chunking that could result in larger batches.
+- Added IOError to the types of errors that are retried.
+- Removed a circular reference and the warning about it.
+
 ## 1.2.0 - 2019-06-14
 
 - Added an include_deleted flag to perform a queryAll operation.
 - Disabled explicit GZIP encoding to work with the latest versions of HTTParty.
-- Added a retry for requests to recover from Net::ReadTimeout errors
+- Added a retry for requests to recover from Net::ReadTimeout errors.
 
 ## 1.1.1 - 2018-11-26
 
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    salesforce_chunker (1.2.0)
+    salesforce_chunker (1.2.1)
       httparty (~> 0.15)
 
 GEM
data/README.md CHANGED
@@ -55,6 +55,9 @@ client = SalesforceChunker::Client.new(
 | security_token | may be required depending on your Salesforce setup |
 | domain | optional. defaults to `"login"`. |
 | salesforce_version | optional. defaults to `"42.0"`. Must be >= `"33.0"` to use PK Chunking. |
+| logger | optional. logger to use. Must be an instance of (or similar to) a Rails logger. Set it here if you want to log all API page requests. |
+| log_output | optional. log output to use, e.g. `STDOUT`. |
+
 
 #### Functions
 
@@ -63,6 +66,7 @@ client = SalesforceChunker::Client.new(
 | query |
 | single_batch_query | calls `query(job_type: "single_batch", **options)` |
 | primary_key_chunking_query | calls `query(job_type: "primary_key_chunking", **options)` |
+| manual_chunking_query | calls `query(job_type: "manual_chunking", **options)` |
 
 #### Query
 
@@ -88,12 +92,12 @@ end
 | --- | --- | --- |
 | query | required | SOQL query. |
 | object | required | Salesforce Object type. |
-| batch_size | optional | defaults to `100000`. Number of records to process in a batch. (Only for PK Chunking) |
+| batch_size | optional | defaults to `100000`. Number of records to process in a batch. (Not used in Single Batch jobs.) |
 | retry_seconds | optional | defaults to `10`. Number of seconds to wait before querying API for updated results. |
-| timeout_seconds | optional | defaults to `3600`. Number of seconds to wait before query is killed. |
+| timeout_seconds | optional | defaults to `3600`. Number of seconds to wait for a batch to process before the job is killed. |
 | logger | optional | logger to use. Must be instance of or similar to rails logger. |
 | log_output | optional | log output to use. i.e. `STDOUT`. |
-| job_type | optional | defaults to `"primary_key_chunking"`. Can also be set to `"single_batch"`. |
+| job_type | optional | defaults to `"primary_key_chunking"`. Can also be set to `"single_batch"` or `"manual_chunking"`. |
 | include_deleted | optional | defaults to `false`. Whether to include deleted records. |
 
 `query` can either be called with a block, or will return an enumerator:
@@ -102,6 +106,34 @@ end
 names = client.query(query, object, options).map { |result| result["Name"] }
 ```
 
+### Single Batch, Primary Key Chunking, and Manual Chunking job types
+
+One advantage of the Salesforce Bulk API over the other Salesforce APIs is that Salesforce can process a number of requests (either queries or uploads) in parallel on its servers. The request chunks are referred to as batches.
+
+#### Single Batch Query
+
+In a single batch query, one SOQL statement is executed as a single batch. This works best when the total number of records to return is fewer than roughly 100,000, depending on memory usage and the number of fields being returned.
+
+#### Primary Key Chunking Query
+
+In Primary Key Chunking, the internal Salesforce PK Chunking flag is used. Salesforce automatically creates a number of batches based on an internal Id index. See https://developer.salesforce.com/docs/atlas.en-us.api_asynch.meta/api_asynch/async_api_headers_enable_pk_chunking.htm
+
+#### Manual Chunking Query
+
+This approach is called "Manual" Chunking because it is this gem's own implementation of PK Chunking. The gem downloads an ordered CSV list of all the Ids it needs to download, then uses that list to generate the breakpoints from which batches are created.
+
+#### Primary Key Chunking Query vs Manual Chunking Query
+
+Advantages of Manual Chunking:
+
+- Manual Chunking takes the where clause of the SOQL statement into account. For example, if you are filtering down to a small fraction of a large object, say 250k out of 20M records, Manual Chunking will split the query into 3 batches of at most 100k records each, while PK Chunking will split it into 200 batches, which uses up more batches and API requests against your account and takes longer.
+- Any object can use Manual Chunking. (According to Salesforce, PK Chunking is supported only for the following objects: Account, Asset, Campaign, CampaignMember, Case, CaseHistory, Contact, Event, EventRelation, Lead, LoginHistory, Opportunity, Task, User, and custom objects.)
+
+Advantages of Primary Key Chunking:
+
+- Primary Key Chunking appears to be slightly faster when using a PK Chunking eligible object and no where clause.
+- Primary Key Chunking may be less buggy, because many more people depend on the Salesforce API than on this gem.
+
 ### Under the hood: SalesforceChunker::Job
 
 Using `SalesforceChunker::Job`, you have more direct access to the Salesforce Bulk API functions, such as `create_batch`, `get_batch_statuses`, and `retrieve_batch_results`. This can be used to perform custom tasks, such as upserts or multiple batch queries.
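The batch counts quoted in the Manual Chunking comparison above (3 batches for 250k records, 200 for 20M) are plain ceiling division over the record count; a quick check of that arithmetic, not code from the gem:

```ruby
# Number of batches needed to cover record_count records at batch_size records per batch.
def batch_count(record_count, batch_size)
  (record_count.to_f / batch_size).ceil
end

# Manual Chunking only batches the 250k records matched by the where clause.
manual_batches = batch_count(250_000, 100_000)      # 3 batches

# PK Chunking splits the whole 20M-record Id range, ignoring the where clause.
pk_batches = batch_count(20_000_000, 100_000)       # 200 batches
```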
@@ -32,7 +32,7 @@ module SalesforceChunker
 
     def post(url, body, headers={})
       @log.info "POST: #{url}"
-      response = self.class.retry_block(log: @log, rescues: Net::ReadTimeout) do
+      response = self.class.retry_block(log: @log) do
         HTTParty.post(@base_url + url, headers: @default_headers.merge(headers), body: body)
       end
       self.class.check_response_error(response.parsed_response)
@@ -40,7 +40,7 @@ module SalesforceChunker
 
     def get_json(url, headers={})
       @log.info "GET: #{url}"
-      response = self.class.retry_block(log: @log, rescues: Net::ReadTimeout) do
+      response = self.class.retry_block(log: @log) do
         HTTParty.get(@base_url + url, headers: @default_headers.merge(headers))
       end
       self.class.check_response_error(response.parsed_response)
@@ -48,7 +48,7 @@ module SalesforceChunker
 
     def get(url, headers={})
       @log.info "GET: #{url}"
-      self.class.retry_block(log: @log, rescues: Net::ReadTimeout) do
+      self.class.retry_block(log: @log) do
         HTTParty.get(@base_url + url, headers: @default_headers.merge(headers)).body
       end
     end
@@ -85,8 +85,9 @@ module SalesforceChunker
 
     MAX_TRIES = 5
     SLEEP_DURATION = 10
+    RESCUED_EXCEPTIONS = [Net::ReadTimeout, IOError]
 
-    def self.retry_block(log: log, tries: MAX_TRIES, sleep_duration: SLEEP_DURATION, rescues:, &block)
+    def self.retry_block(log: Logger.new(nil), tries: MAX_TRIES, sleep_duration: SLEEP_DURATION, rescues: RESCUED_EXCEPTIONS, &block)
       attempt_number = 1
 
       begin
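The retry_block change above turns the rescued exception list into a default (Net::ReadTimeout plus the newly added IOError) instead of a required argument, so callers like post and get no longer name it. A self-contained sketch of a retry helper in that shape, assuming the constants shown in the diff (the body past the signature is illustrative, not the gem's exact code):

```ruby
require "logger"
require "net/protocol"  # defines Net::ReadTimeout

MAX_TRIES = 5
SLEEP_DURATION = 10
RESCUED_EXCEPTIONS = [Net::ReadTimeout, IOError]

# Run the block, retrying when one of the `rescues` exceptions is raised;
# re-raise the exception once `tries` attempts have been exhausted.
def retry_block(log: Logger.new(nil), tries: MAX_TRIES,
                sleep_duration: SLEEP_DURATION, rescues: RESCUED_EXCEPTIONS)
  attempt_number = 1
  begin
    yield
  rescue *rescues => e
    raise if attempt_number >= tries
    log.warn("Attempt #{attempt_number} failed with #{e.class}; retrying in #{sleep_duration}s")
    sleep(sleep_duration)
    attempt_number += 1
    retry
  end
end
```

Defaulting `log` to `Logger.new(nil)` also fixes the circular `log: log` default from 1.2.0, which referenced the keyword argument itself and triggered a Ruby warning.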
@@ -12,18 +12,20 @@ module SalesforceChunker
     end
 
     def get_batch_results(batch_id)
-      retrieve_batch_results(batch_id).each do |result_id|
+      retrieve_batch_results(batch_id).each_with_index do |result_id, result_index|
         results = retrieve_raw_results(batch_id, result_id)
 
         @log.info "Generating breakpoints from CSV results"
-        process_csv_results(results) { |result| yield result }
+        process_csv_results(results, result_index > 0) { |result| yield result }
       end
     end
 
-    def process_csv_results(result)
-      lines = result.each_line
+    def process_csv_results(input, include_first_element)
+      lines = input.each_line
       headers = lines.next
 
+      yield(lines.peek.chomp.gsub("\"", "")) if include_first_element
+
       loop do
         @batch_size.times { lines.next }
         yield(lines.peek.chomp.gsub("\"", ""))
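The process_csv_results change above is the Manual Chunking batch-size fix: when the Id list spans multiple result files, the first Id of each file after the first is now also yielded as a breakpoint, so batches no longer merge across file boundaries. A standalone sketch of the breakpoint walk, with `@batch_size` replaced by a parameter and the yields collected into an array (illustrative, not the gem's exact method):

```ruby
# Walk a CSV of Ids and collect breakpoint Ids: skip batch_size rows at a
# time and record the next Id, optionally recording the very first Id too
# (as the fixed gem does for every result file after the first).
def breakpoints_from_csv(csv, batch_size, include_first_element: false)
  breakpoints = []
  lines = csv.each_line
  lines.next  # discard the header row ("Id")

  breakpoints << lines.peek.chomp.gsub("\"", "") if include_first_element

  loop do  # Kernel#loop exits cleanly when the enumerator raises StopIteration
    batch_size.times { lines.next }
    breakpoints << lines.peek.chomp.gsub("\"", "")
  end
  breakpoints
end
```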
@@ -1,3 +1,3 @@
 module SalesforceChunker
-  VERSION = "1.2.0"
+  VERSION = "1.2.1"
 end
metadata CHANGED
@@ -1,94 +1,94 @@
 --- !ruby/object:Gem::Specification
 name: salesforce_chunker
 version: !ruby/object:Gem::Version
-  version: 1.2.0
+  version: 1.2.1
 platform: ruby
 authors:
 - Curtis Holmes
-autorequire: 
+autorequire: 
 bindir: exe
 cert_chain: []
-date: 2019-06-14 00:00:00.000000000 Z
+date: 2019-06-26 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
-  name: httparty
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '0.15'
-  type: :runtime
+  name: httparty
   prerelease: false
+  type: :runtime
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '0.15'
 - !ruby/object:Gem::Dependency
-  name: bundler
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.16'
-  type: :development
+  name: bundler
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.16'
 - !ruby/object:Gem::Dependency
-  name: rake
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '10.0'
-  type: :development
+  name: rake
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '10.0'
 - !ruby/object:Gem::Dependency
-  name: minitest
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '5.0'
-  type: :development
+  name: minitest
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '5.0'
 - !ruby/object:Gem::Dependency
-  name: mocha
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.5'
-  type: :development
+  name: mocha
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '1.5'
 - !ruby/object:Gem::Dependency
-  name: pry
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
         version: '0.11'
-  type: :development
+  name: pry
   prerelease: false
+  type: :development
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
@@ -127,7 +127,7 @@ homepage: https://github.com/Shopify/salesforce_chunker
 licenses:
 - MIT
 metadata: {}
-post_install_message: 
+post_install_message: 
 rdoc_options: []
 require_paths:
 - lib
@@ -143,7 +143,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     version: '0'
 requirements: []
 rubygems_version: 3.0.2
-signing_key: 
+signing_key: 
 specification_version: 4
 summary: Salesforce Bulk API Client
 test_files: []