salesforce_bulk_api 1.2.0 → 1.3.0

data/CHANGELOG.md ADDED
File without changes
data/README.md CHANGED
@@ -9,10 +9,16 @@
9
9
  - [Authentication](#authentication)
10
10
  - [Usage](#usage)
11
11
  - [Basic Operations](#basic-operations)
12
+ - [Method Parameters](#method-parameters)
13
+ - [Getting Results](#getting-results)
14
+ - [Null Value Handling](#null-value-handling)
12
15
  - [Job Management](#job-management)
16
+ - [Batch Operations](#batch-operations)
13
17
  - [Event Listening](#event-listening)
14
- - [Retrieving Batch Records](#retrieving-batch-records)
15
18
  - [API Call Throttling](#api-call-throttling)
19
+ - [Monitoring and Counters](#monitoring-and-counters)
20
+ - [Error Handling](#error-handling)
21
+ - [Advanced Features](#advanced-features)
16
22
  - [Contributing](#contributing)
17
23
  - [License](#license)
18
24
 
@@ -20,6 +26,14 @@
20
26
 
21
27
  `SalesforceBulkApi` is a Ruby wrapper for the Salesforce Bulk API. It is rewritten from [salesforce_bulk](https://github.com/jorgevaldivia/salesforce_bulk) and adds several missing features, making it easier to perform bulk operations with Salesforce from Ruby applications.
22
28
 
29
+ Key features:
30
+ - Support for all Bulk API operations (create, update, upsert, delete, query)
31
+ - Comprehensive error handling
32
+ - Job and batch status monitoring
33
+ - Event listening for job lifecycle
34
+ - API call throttling and monitoring
35
+ - Performance-optimized string concatenation for large batches
36
+
23
37
  ## Installation
24
38
 
25
39
  Add this line to your application's Gemfile:
@@ -95,8 +109,12 @@ salesforce = SalesforceBulkApi::Api.new(client)
95
109
  ```ruby
96
110
  new_account = { "name" => "Test Account", "type" => "Other" }
97
111
  records_to_insert = [new_account]
112
+
113
+ # Basic usage
98
114
  result = salesforce.create("Account", records_to_insert)
99
- puts "Result: #{result.inspect}"
115
+
116
+ # With response and custom batch size
117
+ result = salesforce.create("Account", records_to_insert, true, false, [], 5000)
100
118
  ```
101
119
 
102
120
  #### Update Records
@@ -104,7 +122,12 @@ puts "Result: #{result.inspect}"
104
122
  ```ruby
105
123
  updated_account = { "name" => "Test Account -- Updated", "id" => "a00A0001009zA2m" }
106
124
  records_to_update = [updated_account]
125
+
126
+ # Basic usage
107
127
  salesforce.update("Account", records_to_update)
128
+
129
+ # With null handling
130
+ salesforce.update("Account", records_to_update, true, true, ["Phone"])
108
131
  ```
109
132
 
110
133
  #### Upsert Records
@@ -112,7 +135,12 @@ salesforce.update("Account", records_to_update)
112
135
  ```ruby
113
136
  upserted_account = { "name" => "Test Account -- Upserted", "External_Field_Name" => "123456" }
114
137
  records_to_upsert = [upserted_account]
138
+
139
+ # Basic usage
115
140
  salesforce.upsert("Account", records_to_upsert, "External_Field_Name")
141
+
142
+ # With all options
143
+ result = salesforce.upsert("Account", records_to_upsert, "External_Field_Name", true, false, [], 10000, 3600)
116
144
  ```
117
145
 
118
146
  #### Delete Records
@@ -120,51 +148,319 @@ salesforce.upsert("Account", records_to_upsert, "External_Field_Name")
120
148
  ```ruby
121
149
  deleted_account = { "id" => "a00A0001009zA2m" }
122
150
  records_to_delete = [deleted_account]
151
+
152
+ # Basic usage
123
153
  salesforce.delete("Account", records_to_delete)
154
+
155
+ # With response
156
+ result = salesforce.delete("Account", records_to_delete, true)
124
157
  ```
125
158
 
126
159
  #### Query Records
127
160
 
128
161
  ```ruby
129
- res = salesforce.query("Account", "SELECT id, name, createddate FROM Account LIMIT 3")
162
+ result = salesforce.query("Account", "SELECT id, name, createddate FROM Account LIMIT 3")
163
+ puts "Records found: #{result["batches"][0]["response"].length}"
164
+ ```
165
+
166
+ ### Method Parameters
167
+
168
+ All bulk operation methods support additional parameters for fine-tuned control:
169
+
170
+ #### Complete Method Signatures:
171
+
172
+ ```ruby
173
+ # CREATE
174
+ salesforce.create(sobject, records, get_response=false, send_nulls=false, no_null_list=[], batch_size=10000, timeout=1500)
175
+
176
+ # UPDATE
177
+ salesforce.update(sobject, records, get_response=false, send_nulls=false, no_null_list=[], batch_size=10000, timeout=1500)
178
+
179
+ # UPSERT
180
+ salesforce.upsert(sobject, records, external_field, get_response=false, send_nulls=false, no_null_list=[], batch_size=10000, timeout=1500)
181
+
182
+ # DELETE
183
+ salesforce.delete(sobject, records, get_response=false, batch_size=10000, timeout=1500)
184
+
185
+ # QUERY
186
+ salesforce.query(sobject, query_string, batch_size=10000, timeout=1500)
187
+ ```
188
+
189
+ #### Parameter Descriptions:
190
+
191
+ - **`get_response`** (Boolean): Whether to return batch processing results (default: false)
192
+ - **`send_nulls`** (Boolean): Whether to send null/empty values to Salesforce (default: false)
193
+ - **`no_null_list`** (Array): Fields to exclude from null value handling when `send_nulls` is true
194
+ - **`batch_size`** (Integer): Number of records per batch (default: 10000, max: 10000)
195
+ - **`timeout`** (Integer): Timeout in seconds for job completion (default: 1500)
196
+
197
+ ### Getting Results
198
+
199
+ When `get_response` is set to true, you'll receive detailed results:
200
+
201
+ ```ruby
202
+ result = salesforce.create("Account", records, true)
203
+
204
+ # Access job information
205
+ puts "Job ID: #{result['job_id']}"
206
+ puts "Job state: #{result['state']}"
207
+
208
+ # Access batch results
209
+ result["batches"].each_with_index do |batch, index|
210
+ puts "Batch #{index + 1}:"
211
+ puts " State: #{batch['state'][0]}"
212
+ puts " Records processed: #{batch['numberRecordsProcessed'][0]}"
213
+
214
+ if batch["response"]
215
+ batch["response"].each do |record|
216
+ if record["success"] == ["true"]
217
+ puts " ✓ Success: #{record['id'][0]}"
218
+ else
219
+ puts " ✗ Error: #{record['errors'][0]['message'][0]}"
220
+ end
221
+ end
222
+ end
223
+ end
224
+ ```
225
+
226
+ ### Null Value Handling
227
+
228
+ Control how null and empty values are handled:
229
+
230
+ ```ruby
231
+ records = [
232
+ { "Id" => "001...", "Name" => "Test", "Phone" => "", "Website" => nil }
233
+ ]
234
+
235
+ # Send nulls for empty/nil fields, except for Phone
236
+ result = salesforce.update("Account", records, true, true, ["Phone"])
237
+
238
+ # This will:
239
+ # - Set Website to NULL in Salesforce (because it's nil)
240
+ # - Leave Phone unchanged (because it's in no_null_list)
241
+ # - Update Name normally
130
242
  ```
131
243
 
132
244
  ### Job Management
133
245
 
134
- You can check the status of a job using its ID:
246
+ #### Get Job by ID
135
247
 
136
248
  ```ruby
137
- job = salesforce.job_from_id('a00A0001009zA2m')
138
- puts "Status: #{job.check_job_status.inspect}"
249
+ job = salesforce.job_from_id('750A0000001234567')
250
+ status = job.check_job_status
251
+ puts "Job state: #{status['state'][0]}"
252
+ puts "Batches total: #{status['numberBatchesTotal'][0]}"
139
253
  ```
140
254
 
141
- ### Event Listening
255
+ #### Check Job Status
256
+
257
+ ```ruby
258
+ job = salesforce.job_from_id(job_id)
259
+ status = job.check_job_status
260
+
261
+ puts "Job Information:"
262
+ puts " State: #{status['state'][0]}"
263
+ puts " Object: #{status['object'][0]}"
264
+ puts " Operation: #{status['operation'][0]}"
265
+ puts " Total Batches: #{status['numberBatchesTotal'][0]}"
266
+ puts " Completed Batches: #{status['numberBatchesCompleted'][0]}"
267
+ puts " Failed Batches: #{status['numberBatchesFailed'][0]}"
268
+ ```
269
+
270
+ ### Batch Operations
142
271
 
143
- You can listen for job creation events:
272
+ #### Check Batch Status
144
273
 
145
274
  ```ruby
146
- salesforce.on_job_created do |job|
147
- puts "Job #{job.job_id} created!"
275
+ job = salesforce.job_from_id(job_id)
276
+ batch_status = job.check_batch_status(batch_id)
277
+
278
+ puts "Batch Information:"
279
+ puts " State: #{batch_status['state'][0]}"
280
+ puts " Records Processed: #{batch_status['numberRecordsProcessed'][0]}"
281
+ puts " Records Failed: #{batch_status['numberRecordsFailed'][0]}"
282
+ ```
283
+
284
+ #### Retrieve Batch Records
285
+
286
+ ```ruby
287
+ job = salesforce.job_from_id(job_id)
288
+ records = job.get_batch_records(batch_id)
289
+
290
+ puts "Batch Records:"
291
+ records.each do |record|
292
+ puts " #{record.inspect}"
148
293
  end
149
294
  ```
150
295
 
151
- ### Retrieving Batch Records
296
+ #### Get Batch Results
152
297
 
153
- Fetch records from a specific batch in a job:
298
+ ```ruby
299
+ job = salesforce.job_from_id(job_id)
300
+ results = job.get_batch_result(batch_id)
301
+
302
+ results.each do |result|
303
+ if result["success"] == ["true"]
304
+ puts "Success: Record ID #{result['id'][0]}"
305
+ else
306
+ puts "Failed: #{result['errors'][0]['message'][0]}"
307
+ end
308
+ end
309
+ ```
310
+
311
+ ### Event Listening
312
+
313
+ Listen for job creation events:
154
314
 
155
315
  ```ruby
156
- job_id = 'l02A0231009Za8m'
157
- batch_id = 'H24a0708089zA2J'
158
- records = salesforce.get_batch_records(job_id, batch_id)
316
+ salesforce.on_job_created do |job|
317
+ puts "Job #{job.job_id} created for #{job.operation} on #{job.sobject}!"
318
+
319
+ # You can perform additional operations here
320
+ # like logging, notifications, etc.
321
+ end
322
+
323
+ # Now when you create/update/etc, the listener will be called
324
+ result = salesforce.create("Account", records)
159
325
  ```
160
326
 
161
327
  ### API Call Throttling
162
328
 
163
- You can control how frequently status checks are performed:
329
+ Control the frequency of status checks to avoid hitting API limits:
164
330
 
165
331
  ```ruby
166
- # Set status check interval to 30 seconds
332
+ # Set status check interval to 30 seconds (default is 5 seconds)
167
333
  salesforce.connection.set_status_throttle(30)
334
+
335
+ # Check current throttle setting
336
+ puts "Current throttle: #{salesforce.connection.get_status_throttle} seconds"
337
+ ```
338
+
339
+ ### Monitoring and Counters
340
+
341
+ Track API usage and operations:
342
+
343
+ ```ruby
344
+ # Get operation counters
345
+ counters = salesforce.counters
346
+ puts "API Usage: #{counters}"
347
+ # => {:http_get=>15, :http_post=>8, :upsert=>2, :update=>1, :create=>3, :delete=>0, :query=>2}
348
+
349
+ # Reset counters
350
+ salesforce.reset_counters
351
+ ```
352
+
353
+ ## Error Handling
354
+
355
+ Bulk operations can raise API-level and timeout exceptions, and an individual batch can fail without raising, so check both:
356
+
357
+ ```ruby
358
+ begin
359
+ result = salesforce.create("Account", records, true)
360
+
361
+ # Check for batch-level errors
362
+ result["batches"].each do |batch|
363
+ if batch["state"][0] == "Failed"
364
+ puts "Batch failed: #{batch["stateMessage"][0]}"
365
+ end
366
+ end
367
+
368
+ rescue SalesforceBulkApi::Job::SalesforceException => e
369
+ puts "Salesforce API error: #{e.message}"
370
+ # Handle API-level errors (invalid objects, fields, etc.)
371
+
372
+ rescue SalesforceBulkApi::JobTimeout => e
373
+ puts "Job timed out: #{e.message}"
374
+ # Handle timeout errors - job took longer than specified timeout
375
+
376
+ rescue => e
377
+ puts "Unexpected error: #{e.message}"
378
+ # Handle other errors (network issues, authentication, etc.)
379
+ end
380
+ ```
381
+
382
+ ### Common Error Scenarios
383
+
384
+ ```ruby
385
+ # Invalid field names
386
+ begin
387
+ records = [{ "InvalidField__c" => "value" }]
388
+ salesforce.create("Account", records, true)
389
+ rescue SalesforceBulkApi::Job::SalesforceException => e
390
+ puts "Field error: #{e.message}"
391
+ end
392
+
393
+ # Malformed record IDs usually do not raise; inspect the batch results instead
394
+ begin
395
+ records = [{ "Id" => "invalid_id" }]
396
+ result = salesforce.update("Account", records, true)
397
+ failed_records = result["batches"][0]["response"].select { |r| r["success"] == ["false"] }
398
+ failed_records.each { |r| puts "Failed: #{r['errors'][0]['message'][0]}" }
399
+ rescue => e
400
+ # Reached only if the request itself fails (network, authentication, etc.)
401
+ puts "Unexpected error: #{e.message}"
402
+ end
403
+ ```
404
+
405
+ ## Advanced Features
406
+
407
+ ### Relationship Fields
408
+
409
+ You can work with relationship fields using dot notation:
410
+
411
+ ```ruby
412
+ # Create records with relationship data
413
+ records = [
414
+ {
415
+ "Name" => "Test Account",
416
+ "Parent.Name" => "Parent Account Name",
417
+ "Owner.Email" => "owner@example.com"
418
+ }
419
+ ]
420
+
421
+ result = salesforce.create("Account", records, true)
422
+ ```
423
+
424
+ ### Special Data Types
425
+
426
+ The gem automatically handles various data types:
427
+
428
+ ```ruby
429
+ records = [
430
+ {
431
+ "Name" => "Test Account",
432
+ "AnnualRevenue" => 1000000, # Numbers
433
+ "IsActive__c" => true, # Booleans
434
+ "LastModifiedDate" => Time.now, # Timestamps (converted to ISO8601)
435
+ "Description" => "Text with <special> chars" # XML encoding handled automatically
436
+ }
437
+ ]
438
+ ```
439
+
440
+ ### Large Dataset Handling
441
+
442
+ For large datasets, the gem automatically handles batching:
443
+
444
+ ```ruby
445
+ # This will be automatically split into multiple batches of 10,000 records each
446
+ large_dataset = (1..50000).map { |i| { "Name" => "Account #{i}" } }
447
+
448
+ result = salesforce.create("Account", large_dataset, true, false, [], 10000, 7200) # 2 hour timeout
449
+ puts "Created #{result['batches'].length} batches"
450
+ ```
451
+
452
+ ### Custom Batch Sizes
453
+
454
+ Optimize for your use case:
455
+
456
+ ```ruby
457
+ # Smaller batches for complex records
458
+ complex_records = [...]
459
+ salesforce.create("CustomObject__c", complex_records, true, false, [], 2000)
460
+
461
+ # Larger batches for simple records (up to 10,000)
462
+ simple_records = [...]
463
+ salesforce.create("Account", simple_records, true, false, [], 10000)
168
464
  ```
169
465
 
170
466
  ## Contributing
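The README's Large Dataset Handling example states that 50,000 records are split into batches of 10,000. The arithmetic can be sketched in plain Ruby as below. This uses `Enumerable#each_slice` purely as an illustration; the gem's own batching code is not part of this diff, so nothing here should be read as its implementation.

```ruby
# Illustration only: models the batch split with Enumerable#each_slice.
# This is NOT the gem's internal batching code.
BATCH_SIZE = 10_000

large_dataset = (1..50_000).map { |i| { "Name" => "Account #{i}" } }
batches = large_dataset.each_slice(BATCH_SIZE).to_a

puts "Batches: #{batches.length}"
puts "Records in last batch: #{batches.last.length}"
```

With a record count that is not a multiple of the batch size, the last slice is simply shorter.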
@@ -177,6 +473,24 @@ We welcome contributions to improve this gem. Feel free to:
177
473
  4. Push to the branch (`git push origin feature/amazing-feature`)
178
474
  5. Create a new Pull Request
179
475
 
476
+ ### Development Setup
477
+
478
+ ```bash
479
+ git clone https://github.com/yatish27/salesforce_bulk_api.git
480
+ cd salesforce_bulk_api
481
+ bundle install
482
+
483
+ # Copy environment template
484
+ cp .env.sample .env
485
+ # Edit .env with your Salesforce credentials
486
+
487
+ # Run tests
488
+ bundle exec rspec
489
+
490
+ # Run RuboCop
491
+ bundle exec rubocop
492
+ ```
493
+
180
494
  ## License
181
495
 
182
- This project is licensed under the MIT License, Copyright (c) 2025 - see the [LICENCE](LICENCE) file for details.
496
+ This project is licensed under the MIT License, Copyright (c) 2025 - see the [LICENCE](LICENCE) file for details.
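The README's Getting Results section reads each record's fields as single-element arrays (for example `record["success"] == ["true"]`). A minimal sketch of splitting such a result into successes and failures follows. The sample hash is hand-built to mirror the shape shown in the README, the ID and error message are made up, and `partition_responses` is a hypothetical helper, not a method of the gem.

```ruby
# Sample data shaped like the README's `result` hash. The ID and the error
# message are invented for illustration.
result = {
  "batches" => [
    {
      "state" => ["Completed"],
      "response" => [
        { "success" => ["true"], "id" => ["001000000000001AAA"] },
        { "success" => ["false"], "errors" => [{ "message" => ["Sample failure message"] }] }
      ]
    }
  ]
}

# Hypothetical helper: collect every record response across all batches and
# split them into [successes, failures]. Batches without a "response" key
# contribute nothing.
def partition_responses(result)
  responses = result["batches"].flat_map { |batch| batch["response"] || [] }
  responses.partition { |record| record["success"] == ["true"] }
end

succeeded, failed = partition_responses(result)
puts "Succeeded: #{succeeded.map { |r| r['id'][0] }.inspect}"
puts "Failed: #{failed.map { |r| r['errors'][0]['message'][0] }.inspect}"
```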
@@ -19,10 +19,10 @@ module SalesforceBulkApi::Concerns
19
19
  key = extract_constraint_key_from(details, throttle_by_keys)
20
20
  last_request = limit_log[key]
21
21
 
22
- if last_request && only_if.call(details)
22
+ if !last_request.nil? && only_if.call(details)
23
23
  seconds_since_last_request = Time.now.to_f - last_request.to_f
24
24
  need_to_wait_seconds = limit_seconds - seconds_since_last_request
25
- sleep(need_to_wait_seconds) if need_to_wait_seconds.positive?
25
+ sleep(need_to_wait_seconds) if need_to_wait_seconds > 0
26
26
  end
27
27
 
28
28
  limit_log[key] = Time.now
@@ -32,17 +32,26 @@ module SalesforceBulkApi::Concerns
32
32
  private
33
33
 
34
34
  def extract_constraint_key_from(details, throttle_by_keys)
35
- throttle_by_keys.each_with_object({}) { |k, hash| hash[k] = details[k] }
35
+ hash = {}
36
+ throttle_by_keys.each { |k| hash[k] = details[k] }
37
+ hash
36
38
  end
37
39
 
38
40
  def get_limit_log(prune_older_than)
39
- @limits ||= {}
40
- @limits.delete_if { |_, v| v < prune_older_than }
41
+ @limits ||= Hash.new(0)
42
+
43
+ @limits.delete_if do |k, v|
44
+ v < prune_older_than
45
+ end
46
+
47
+ @limits
41
48
  end
42
49
 
43
50
  def throttle(details = {})
44
51
  (@throttles || []).each do |callback|
45
- callback.call(details)
52
+ args = [details]
53
+ args = args[0..callback.arity]
54
+ callback.call(*args)
46
55
  end
47
56
  end
48
57
  end
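The change to `throttle` above slices the argument list by `callback.arity` before calling the callback. The sketch below isolates that slice so its effect can be observed. `args_for` is a hypothetical wrapper around the added lines, and the sample procs are assumptions chosen to cover several arities.

```ruby
# Mirrors the added lines in `throttle`, extracted into a standalone method
# (hypothetical name) so the slicing can be observed in isolation.
def args_for(callback, details)
  args = [details]
  args[0..callback.arity]
end

details = { http_method: :get, path: "job" }

callbacks = {
  "one argument (arity 1)"  => proc { |d| d },
  "two arguments (arity 2)" => proc { |d, extra| [d, extra] },
  "splat (arity -1)"        => proc { |*all| all },
  "no arguments (arity 0)"  => proc { "called" }
}

callbacks.each do |label, callback|
  sliced = args_for(callback, details)
  puts "#{label}: passes #{sliced.length} argument(s)"
end
```

For each of these arities the slice still contains the single `details` hash, because `args` only ever holds one element. A non-lambda proc with arity 0 ignores the extra argument, whereas a lambda with arity 0 would raise `ArgumentError` when called with it.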
@@ -1,92 +1,96 @@
1
1
  require "timeout"
2
- require "net/https"
3
2
 
4
3
  module SalesforceBulkApi
5
4
  class Connection
6
5
  include Concerns::Throttling
7
6
 
8
- LOGIN_HOST = "login.salesforce.com".freeze
9
-
10
- attr_reader :session_id, :server_url, :instance, :instance_host
7
+ LOGIN_HOST = "login.salesforce.com"
11
8
 
12
9
  def initialize(api_version, client)
13
10
  @client = client
14
11
  @api_version = api_version
15
12
  @path_prefix = "/services/async/#{@api_version}/"
16
- @counters = Hash.new(0)
17
13
 
18
14
  login
19
15
  end
20
16
 
21
- def post_xml(host, path, xml, headers)
22
- host ||= @instance_host
23
- headers["X-SFDC-Session"] = @session_id unless host == LOGIN_HOST
24
- path = "#{@path_prefix}#{path}" unless host == LOGIN_HOST
25
-
26
- perform_request(:post, host, path, xml, headers)
27
- end
28
-
29
- def get_request(host, path, headers)
30
- host ||= @instance_host
31
- path = "#{@path_prefix}#{path}"
32
- headers["X-SFDC-Session"] = @session_id unless host == LOGIN_HOST
33
-
34
- perform_request(:get, host, path, nil, headers)
35
- end
36
-
37
- def counters
38
- {
39
- get: @counters[:get],
40
- post: @counters[:post]
41
- }
42
- end
43
-
44
- private
45
-
46
17
  def login
47
18
  client_type = @client.class.to_s
48
- @session_id, @server_url = if client_type == "Restforce::Data::Client"
49
- [@client.options[:oauth_token], @client.options[:instance_url]]
19
+ case client_type
20
+ when "Restforce::Data::Client"
21
+ @session_id = @client.options[:oauth_token]
22
+ @server_url = @client.options[:instance_url]
50
23
  else
51
- [@client.oauth_token, @client.instance_url]
24
+ @session_id = @client.oauth_token
25
+ @server_url = @client.instance_url
52
26
  end
53
27
  @instance = parse_instance
54
28
  @instance_host = "#{@instance}.salesforce.com"
55
29
  end
56
30
 
57
- def perform_request(method, host, path, body, headers)
58
- retries = 0
31
+ def post_xml(host, path, xml, headers)
32
+ host ||= @instance_host
33
+ if host != LOGIN_HOST # Not login, need to add session id to header
34
+ headers["X-SFDC-Session"] = @session_id
35
+ path = "#{@path_prefix}#{path}"
36
+ end
37
+ i = 0
59
38
  begin
60
- count(method)
61
- throttle(http_method: method, path: path)
62
- response = https(host).public_send(method, path, body, headers)
63
- response.body
64
- rescue => e
65
- retries += 1
66
- if retries < 3
67
- puts "Request fail #{retries}: Retrying #{path}"
39
+ count :post
40
+ throttle(http_method: :post, path: path)
41
+ https(host).post(path, xml, headers).body
42
+ rescue
43
+ i += 1
44
+ if i < 3
45
+ puts "Request fail #{i}: Retrying #{path}"
68
46
  retry
69
47
  else
70
48
  puts "FATAL: Request to #{path} failed three times."
71
- raise e
49
+ raise
72
50
  end
73
51
  end
74
52
  end
75
53
 
76
- def https(host)
77
- Net::HTTP.new(host, 443).tap do |http|
78
- http.use_ssl = true
79
- http.verify_mode = OpenSSL::SSL::VERIFY_NONE
54
+ def get_request(host, path, headers)
55
+ host ||= @instance_host
56
+ path = "#{@path_prefix}#{path}"
57
+ if host != LOGIN_HOST # Not login, need to add session id to header
58
+ headers["X-SFDC-Session"] = @session_id
80
59
  end
60
+
61
+ count :get
62
+ throttle(http_method: :get, path: path)
63
+ https(host).get(path, headers).body
64
+ end
65
+
66
+ def https(host)
67
+ req = Net::HTTP.new(host, 443)
68
+ req.use_ssl = true
69
+ req.verify_mode = OpenSSL::SSL::VERIFY_NONE
70
+ req
71
+ end
72
+
73
+ def counters
74
+ {
75
+ get: get_counters[:get],
76
+ post: get_counters[:post]
77
+ }
78
+ end
79
+
80
+ private
81
+
82
+ def get_counters
83
+ @counters ||= Hash.new { |hash, key| hash[key] = 0 }
81
84
  end
82
85
 
83
86
  def count(http_method)
84
- @counters[http_method] += 1
87
+ get_counters[http_method] += 1
85
88
  end
86
89
 
87
90
  def parse_instance
88
- instance = @server_url.match(%r{https://([a-z]{2}[0-9]{1,2})\.})&.captures&.first
89
- instance || @server_url.split(".salesforce.com").first.split("://").last
91
+ @instance = @server_url.match(/https:\/\/[a-z]{2}[0-9]{1,2}\./).to_s.gsub("https://", "").split(".")[0]
92
+ @instance = @server_url.split(".salesforce.com")[0].split("://")[1] if @instance.nil? || @instance.empty?
93
+ @instance
90
94
  end
91
95
  end
92
96
  end
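The rewritten `parse_instance` first tries to match a short instance prefix (two letters followed by one or two digits) and otherwise falls back to splitting on `.salesforce.com`. The sketch below copies that logic into a standalone method so both paths can be traced. `parse_instance_from` is a hypothetical name, it takes the URL as an argument instead of reading `@server_url`, and the two URLs are sample inputs, not values from the gem.

```ruby
# Standalone copy of the 1.3.0 `parse_instance` logic (hypothetical method
# name; the original reads @server_url and assigns @instance).
def parse_instance_from(server_url)
  # Path 1: short instance names such as "na15".
  instance = server_url.match(/https:\/\/[a-z]{2}[0-9]{1,2}\./).to_s.gsub("https://", "").split(".")[0]
  # Path 2: fall back to everything between "://" and ".salesforce.com".
  instance = server_url.split(".salesforce.com")[0].split("://")[1] if instance.nil? || instance.empty?
  instance
end

puts parse_instance_from("https://na15.salesforce.com")       # => na15
puts parse_instance_from("https://example.my.salesforce.com") # => example.my
```

In the second case the regex does not match, `"".split(".")[0]` evaluates to `nil`, and the fallback branch supplies the value.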