redis_queued_locks 1.7.0 → 1.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (43)
  1. checksums.yaml +4 -4
  2. data/.ruby-version +1 -1
  3. data/CHANGELOG.md +60 -1
  4. data/README.md +485 -46
  5. data/lib/redis_queued_locks/acquier/acquire_lock/dequeue_from_lock_queue/log_visitor.rb +4 -0
  6. data/lib/redis_queued_locks/acquier/acquire_lock/dequeue_from_lock_queue.rb +4 -1
  7. data/lib/redis_queued_locks/acquier/acquire_lock/instr_visitor.rb +20 -5
  8. data/lib/redis_queued_locks/acquier/acquire_lock/log_visitor.rb +24 -0
  9. data/lib/redis_queued_locks/acquier/acquire_lock/try_to_lock/log_visitor.rb +56 -0
  10. data/lib/redis_queued_locks/acquier/acquire_lock/try_to_lock.rb +37 -30
  11. data/lib/redis_queued_locks/acquier/acquire_lock/with_acq_timeout.rb +41 -7
  12. data/lib/redis_queued_locks/acquier/acquire_lock/yield_expire/log_visitor.rb +8 -0
  13. data/lib/redis_queued_locks/acquier/acquire_lock/yield_expire.rb +21 -9
  14. data/lib/redis_queued_locks/acquier/acquire_lock.rb +61 -22
  15. data/lib/redis_queued_locks/acquier/clear_dead_requests.rb +5 -1
  16. data/lib/redis_queued_locks/acquier/extend_lock_ttl.rb +5 -1
  17. data/lib/redis_queued_locks/acquier/lock_info.rb +4 -3
  18. data/lib/redis_queued_locks/acquier/locks.rb +2 -2
  19. data/lib/redis_queued_locks/acquier/queue_info.rb +2 -2
  20. data/lib/redis_queued_locks/acquier/release_all_locks.rb +12 -2
  21. data/lib/redis_queued_locks/acquier/release_lock.rb +12 -2
  22. data/lib/redis_queued_locks/client.rb +320 -10
  23. data/lib/redis_queued_locks/errors.rb +8 -0
  24. data/lib/redis_queued_locks/instrument.rb +8 -1
  25. data/lib/redis_queued_locks/logging.rb +8 -1
  26. data/lib/redis_queued_locks/resource.rb +59 -1
  27. data/lib/redis_queued_locks/swarm/acquirers.rb +44 -0
  28. data/lib/redis_queued_locks/swarm/flush_zombies.rb +133 -0
  29. data/lib/redis_queued_locks/swarm/probe_hosts.rb +69 -0
  30. data/lib/redis_queued_locks/swarm/redis_client_builder.rb +67 -0
  31. data/lib/redis_queued_locks/swarm/supervisor.rb +83 -0
  32. data/lib/redis_queued_locks/swarm/swarm_element/isolated.rb +287 -0
  33. data/lib/redis_queued_locks/swarm/swarm_element/threaded.rb +351 -0
  34. data/lib/redis_queued_locks/swarm/swarm_element.rb +8 -0
  35. data/lib/redis_queued_locks/swarm/zombie_info.rb +145 -0
  36. data/lib/redis_queued_locks/swarm.rb +241 -0
  37. data/lib/redis_queued_locks/utilities/lock.rb +22 -0
  38. data/lib/redis_queued_locks/utilities.rb +75 -0
  39. data/lib/redis_queued_locks/version.rb +2 -2
  40. data/lib/redis_queued_locks.rb +2 -0
  41. data/redis_queued_locks.gemspec +6 -10
  42. metadata +24 -6
  43. data/lib/redis_queued_locks/watcher.rb +0 -1
data/README.md CHANGED
@@ -34,6 +34,23 @@ Provides flexible invocation flow, parametrized limits (lock request ttl, lock t
34
34
  - [locks_info](#locks_info---get-list-of-locks-with-their-info)
35
35
  - [queues_info](#queues_info---get-list-of-queues-with-their-info)
36
36
  - [clear_dead_requests](#clear_dead_requests)
37
+ - [current_acquirer_id](#current_acquirer_id)
38
+ - [current_host_id](#current_host_id)
39
+ - [possible_host_ids](#possible_host_ids)
40
+ - [Swarm Mode and Zombie Locks](#swarm-mode-and-zombie-locks)
41
+ - [work and usage preview (temporary example-based docs)](#work-and-usage-preview-temporary-example-based-docs)
42
+ - [How to Swarm](#how-to-swarm)
43
+ - [configuration](#)
44
+ - [swarm_status](#swarm_status)
45
+ - [swarm_info](#swarm_info)
46
+ - [swarmize!](#swarmize!)
47
+ - [deswarmize!](#deswarmize!)
48
+ - [probe_hosts](#probe_hosts)
49
+ - [flush_zombies](#flush_zombies)
50
+ - [zombies_info](#zombies_info)
51
+ - [zombie_locks](#zombie_locks)
52
+ - [zombie_hosts](#zombie_hosts)
53
+ - [zombie_acquiers](#zombie_acquiers)
37
54
  - [Lock Access Strategies](#lock-access-strategies)
38
55
  - [queued](#lock-access-strategies)
39
56
  - [random](#lock-access-strategies)
@@ -157,6 +174,29 @@ clinet = RedisQueuedLocks::Client.new(redis_client) do |config|
157
174
  # - should be all blocks of code are timed by default;
158
175
  config.is_timed_by_default = false
159
176
 
177
+ # (boolean) (default: false)
178
+ # - When a lock acquirement attempt reaches the acquirement time limit (:timeout option), the
179
+ # `RedisQueuedLocks::LockAcquirementTimeoutError` is raised (when the `raise_errors` option
180
+ # of the #lock method is set to `true`). The error message contains the lock key name and
181
+ # the timeout value.
182
+ # - `true` adds the following details to the error message:
183
+ # - current lock queue state (you can see which acquirer blocks your request and
184
+ # how many acquirers are in the queue);
185
+ # - current lock data stored inside (for example: you can check the current acquirer and
186
+ # the lock meta state if you store some additional data there);
187
+ # - Implemented as an option because the additional lock data requires two additional Redis
188
+ # queries: (1) get the current lock from redis and (2) fetch the lock queue state;
189
+ # - These two additional Redis queries are asynchronous in nature, so you can receive
190
+ # inconsistent lock and lock queue data in your error message because:
191
+ # - the required lock can be released after the error moment and before the error message is built;
192
+ # - the required lock can be obtained by another process after the error moment and
193
+ # before the error message is built;
194
+ # - the required lock queue can reach a state where the blocking acquirer starts to obtain the lock
195
+ # and is removed from the lock queue after the error moment and before the error message is built;
196
+ # - You should take the asynchronous nature of this error message into account and treat the data
197
+ # received from the error message accordingly;
198
+ config.detailed_acq_timeout_error = false
199
+
160
200
  # (symbol) (default: :queued)
161
201
  # - Defines the way in which the lock should be obtained;
162
202
  # - By default it is configured to obtain a lock in classic `queued` way:
@@ -220,35 +260,35 @@ clinet = RedisQueuedLocks::Client.new(redis_client) do |config|
220
260
  # - should implement `debug(progname = nil, &block)` (minimal requirement) or be an instance of Ruby's `::Logger` class/subclass;
221
261
  # - supports `SemanticLogger::Logger` (see "semantic_logger" gem)
222
262
  # - at this moment the only debug logs are realised in following cases:
223
- # - "[redis_queued_locks.start_lock_obtaining]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
224
- # - "[redis_queued_locks.start_try_to_lock_cycle]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
225
- # - "[redis_queued_locks.dead_score_reached__reset_acquier_position]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
226
- # - "[redis_queued_locks.lock_obtained]" (logs "lock_key", "queue_ttl", "acq_id", "acq_time", "acs_strat");
227
- # - "[redis_queued_locks.extendable_reentrant_lock_obtained]" (logs "lock_key", "queue_ttl", "acq_id", "acq_time", "acs_strat");
228
- # - "[redis_queued_locks.reentrant_lock_obtained]" (logs "lock_key", "queue_ttl", "acq_id", "acq_time", "acs_strat");
229
- # - "[redis_queued_locks.fail_fast_or_limits_reached_or_deadlock__dequeue]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
230
- # - "[redis_queued_locks.expire_lock]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
231
- # - "[redis_queued_locks.decrease_lock]" (logs "lock_key", "decreased_ttl", "queue_ttl", "acq_id", "acs_strat");
263
+ # - "[redis_queued_locks.start_lock_obtaining]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
264
+ # - "[redis_queued_locks.start_try_to_lock_cycle]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
265
+ # - "[redis_queued_locks.dead_score_reached__reset_acquier_position]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
266
+ # - "[redis_queued_locks.lock_obtained]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acq_time", "acs_strat");
267
+ # - "[redis_queued_locks.extendable_reentrant_lock_obtained]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acq_time", "acs_strat");
268
+ # - "[redis_queued_locks.reentrant_lock_obtained]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acq_time", "acs_strat");
269
+ # - "[redis_queued_locks.fail_fast_or_limits_reached_or_deadlock__dequeue]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
270
+ # - "[redis_queued_locks.expire_lock]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
271
+ # - "[redis_queued_locks.decrease_lock]" (logs "lock_key", "decreased_ttl", "queue_ttl", "acq_id", "hst_id", "acs_strat");
232
272
  # - by default uses VoidLogger that does nothing;
233
273
  config.logger = RedisQueuedLocks::Logging::VoidLogger
234
274
 
235
275
  # (default: false)
236
276
  # - adds additional debug logs;
237
277
  # - enables additional logs for each internal try-retry lock acquiring (a lot of logs can be generated depending on your retry configurations);
238
- # - it adds following logs in addition to the existing:
239
- # - "[redis_queued_locks.try_lock.start]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
240
- # - "[redis_queued_locks.try_lock.rconn_fetched]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
241
- # - "[redis_queued_locks.try_lock.same_process_conflict_detected]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
242
- # - "[redis_queued_locks.try_lock.same_process_conflict_analyzed]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "spc_status");
243
- # - "[redis_queued_locks.try_lock.reentrant_lock__extend_and_work_through]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "spc_status", "last_ext_ttl", "last_ext_ts");
244
- # - "[redis_queued_locks.try_lock.reentrant_lock__work_through]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "spc_status", last_spc_ts);
245
- # - "[redis_queued_locks.try_lock.acq_added_to_queue]" (logs "lock_key", "queue_ttl", "acq_id, "acs_strat")";
246
- # - "[redis_queued_locks.try_lock.remove_expired_acqs]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
247
- # - "[redis_queued_locks.try_lock.get_first_from_queue]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "first_acq_id_in_queue");
248
- # - "[redis_queued_locks.try_lock.exit__queue_ttl_reached]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
249
- # - "[redis_queued_locks.try_lock.exit__no_first]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "first_acq_id_in_queue", "<current_lock_data>");
250
- # - "[redis_queued_locks.try_lock.exit__lock_still_obtained]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "first_acq_id_in_queue", "locked_by_acq_id", "<current_lock_data>");
251
- # - "[redis_queued_locks.try_lock.obtain__free_to_acquire]" (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
278
+ # - it adds the following debug logs in addition to the existing ones:
279
+ # - "[redis_queued_locks.try_lock.start]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
280
+ # - "[redis_queued_locks.try_lock.rconn_fetched]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
281
+ # - "[redis_queued_locks.try_lock.same_process_conflict_detected]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
282
+ # - "[redis_queued_locks.try_lock.same_process_conflict_analyzed]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "spc_status");
283
+ # - "[redis_queued_locks.try_lock.reentrant_lock__extend_and_work_through]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "spc_status", "last_ext_ttl", "last_ext_ts");
284
+ # - "[redis_queued_locks.try_lock.reentrant_lock__work_through]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "spc_status", "last_spc_ts");
285
+ # - "[redis_queued_locks.try_lock.acq_added_to_queue]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
286
+ # - "[redis_queued_locks.try_lock.remove_expired_acqs]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
287
+ # - "[redis_queued_locks.try_lock.get_first_from_queue]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "first_acq_id_in_queue");
288
+ # - "[redis_queued_locks.try_lock.exit__queue_ttl_reached]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
289
+ # - "[redis_queued_locks.try_lock.exit__no_first]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "first_acq_id_in_queue", "<current_lock_data>");
290
+ # - "[redis_queued_locks.try_lock.exit__lock_still_obtained]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "first_acq_id_in_queue", "locked_by_acq_id", "<current_lock_data>");
291
+ # - "[redis_queued_locks.try_lock.obtain__free_to_acquire]" (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
252
292
  config.log_lock_try = false
253
293
 
254
294
  # (default: false)
@@ -316,6 +356,9 @@ end
316
356
  - [locks_info](#locks_info---get-list-of-locks-with-their-info)
317
357
  - [queues_info](#queues_info---get-list-of-queues-with-their-info)
318
358
  - [clear_dead_requests](#clear_dead_requests)
359
+ - [current_acquirer_id](#current_acquirer_id)
360
+ - [current_host_id](#current_host_id)
361
+ - [possible_host_ids](#possible_host_ids)
319
362
 
320
363
  ---
321
364
 
@@ -348,6 +391,7 @@ def lock(
348
391
  access_strategy: config[:default_access_strategy],
349
392
  identity: uniq_identity, # (attr_accessor) calculated during client instantiation via config[:uniq_identifier] proc;
350
393
  meta: nil,
394
+ detailed_acq_timeout_error: config[:detailed_acq_timeout_error],
351
395
  instrument: nil,
352
396
  instrumenter: config[:instrumenter],
353
397
  logger: config[:logger],
@@ -355,9 +399,11 @@ def lock(
355
399
  log_sampling_enabled: config[:log_sampling_enabled],
356
400
  log_sampling_percent: config[:log_sampling_percent],
357
401
  log_sampler: config[:log_sampler],
402
+ log_sample_this: false,
358
403
  instr_sampling_enabled: config[:instr_sampling_enabled],
359
404
  instr_sampling_percent: config[:instr_sampling_percent],
360
405
  instr_sampler: config[:instr_sampler],
406
+ instr_sample_this: false,
361
407
  &block
362
408
  )
363
409
  ```
@@ -433,6 +479,25 @@ def lock(
433
479
  - Custom metadata which will be passed to the lock data in addition to the existing data;
434
480
  - Custom metadata can not contain reserved lock data keys (such as `lock_key`, `acq_id`, `ts`, `ini_ttl`, `rem_ttl`);
435
481
  - `nil` by default (means "no metadata");
482
+ - `detailed_acq_timeout_error` - (optional) `[Boolean]`
483
+ - When a lock acquirement attempt reaches the acquirement time limit (:timeout option), the
484
+ `RedisQueuedLocks::LockAcquirementTimeoutError` is raised (when the `raise_errors` option
485
+ is set to `true`). The error message contains the lock key name and the timeout value.
486
+ - `true` adds the following details to the error message:
487
+ - current lock queue state (you can see which acquirer blocks your request and how many acquirers are in the queue);
488
+ - current lock data stored inside (for example: you can check the current acquirer and the lock meta state if you store some additional data there);
489
+ - Implemented as an option because the additional lock data requires two additional Redis
490
+ queries: (1) get the current lock from redis and (2) fetch the lock queue state;
491
+ - These two additional Redis queries are asynchronous in nature, so you can receive
492
+ inconsistent lock and lock queue data in your error message because:
493
+ - the required lock can be released after the error moment and before the error message is built;
494
+ - the required lock can be obtained by another process after the error moment and
495
+ before the error message is built;
496
+ - the required lock queue can reach a state where the blocking acquirer starts to obtain the lock
497
+ and is removed from the lock queue after the error moment and before the error message is built;
498
+ - You should take the asynchronous nature of this error message into account and treat the data
499
+ received from the error message accordingly;
500
+ - pre-configured in `config[:detailed_acq_timeout_error]`;
436
501
  - `logger` - (optional) `[::Logger,#debug]`
437
502
  - Logger object used for logging internal mutation operations and operation results / process progress;
438
503
  - pre-configured in `config[:logger]` with void logger `RedisQueuedLocks::Logging::VoidLogger`;
@@ -458,6 +523,10 @@ def lock(
458
523
  - you can provide your own log sampler with a better algorithm that should implement the
459
524
  `sampling_happened?(percent) => boolean` interface (see `RedisQueuedLocks::Logging::Sampler` for example);
460
525
  - pre-configured in `config[:log_sampler]`;
526
+ - `log_sample_this` - (optional) `[Boolean]`
527
+ - forces this method call to be logged even when log sampling is enabled;
528
+ - makes sense when log sampling is enabled;
529
+ - `false` by default;
461
530
  - `instr_sampling_enabled` - (optional) `[Boolean]`
462
531
  - enables **instrumentation sampling**: only the configured percent of RQL cases will be instrumented;
463
532
  - disabled by default;
@@ -476,6 +545,10 @@ def lock(
476
545
  - you can provide your own instrumentation sampler with a better algorithm that should implement the
477
546
  `sampling_happened?(percent) => boolean` interface (see `RedisQueuedLocks::Instrument::Sampler` for example);
478
547
  - pre-configured in `config[:instr_sampler]`;
548
+ - `instr_sample_this` - (optional) `[Boolean]`
549
+ - forces this method call to be instrumented even when instrumentation sampling is enabled;
550
+ - makes sense when instrumentation sampling is enabled;
551
+ - `false` by default;
479
552
  - `block` - (optional) `[Block]`
480
553
  - A block of code that should be executed after the successfully acquired lock.
481
554
  - If block is **passed** the obtained lock will be released after the block execution or its ttl (whichever happens first);
@@ -521,6 +594,7 @@ Return value:
521
594
  result: {
522
595
  lock_key: String, # acquired lock key ("rql:lock:your_lock_name")
523
596
  acq_id: String, # acquier identifier ("process_id/thread_id/fiber_id/ractor_id/identity")
597
+ hst_id: String, # host identifier ("process_id/thread_id/ractor_id/identity")
524
598
  ts: Float, # time (epoch) when lock was obtained (float, Time#to_f)
525
599
  ttl: Integer, # lock's time to live in milliseconds (integer)
526
600
  process: Symbol # which logical process has acquired the lock (:lock_obtaining, :extendable_conflict_work_through, :conflict_work_through, :conflict_dead_lock)
@@ -535,6 +609,7 @@ Return value:
535
609
  result: {
536
610
  lock_key: "rql:lock:my_lock",
537
611
  acq_id: "rql:acq:26672/2280/2300/2320/70ea5dbf10ea1056",
612
+ hst_id: "rql:hst:26672/2280/2320/70ea5dbf10ea1056",
538
613
  ts: 1711909612.653696,
539
614
  ttl: 10000,
540
615
  process: :lock_obtaining # for custom conflict strategies may be: :conflict_dead_lock, :conflict_work_through, :extendable_conflict_work_through
@@ -610,6 +685,7 @@ rql.lock_info("my_lock")
610
685
  {
611
686
  "lock_key" => "rql:lock:my_lock",
612
687
  "acq_id" => "rql:acq:123/456/567/678/374dd74324",
688
+ "hst_id" => "rql:hst:123/456/678/374dd74324",
613
689
  "ts" => 123456789,
614
690
  "ini_ttl" => 123456,
615
691
  "rem_ttl" => 123440,
@@ -725,6 +801,7 @@ def lock!(
725
801
  fail_fast: false,
726
802
  identity: uniq_identity,
727
803
  meta: nil,
804
+ detailed_acq_timeout_error: config[:detailed_acq_timeout_error],
728
805
  logger: config[:logger],
729
806
  log_lock_try: config[:log_lock_try],
730
807
  instrument: nil,
@@ -734,9 +811,11 @@ def lock!(
734
811
  log_sampling_enabled: config[:log_sampling_enabled],
735
812
  log_sampling_percent: config[:log_sampling_percent],
736
813
  log_sampler: config[:log_sampler],
814
+ log_sample_this: false,
737
815
  instr_sampling_enabled: config[:instr_sampling_enabled],
738
816
  instr_sampling_percent: config[:instr_sampling_percent],
739
817
  instr_sampler: config[:instr_sampler],
818
+ instr_sample_this: false,
740
819
  &block
741
820
  )
742
821
  ```
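The effect of the `detailed_acq_timeout_error` option in the signature above can be pictured with a small standalone sketch. It does not call the gem: `build_timeout_message`, its message wording, and the shape of the `lock_info` / `queue_info` hashes are assumptions made for illustration only.

```ruby
# Hypothetical illustration only: this is NOT the gem's implementation.
# It shows why the detailed variant needs extra data and why that data may be stale.
def build_timeout_message(lock_key:, timeout:, detailed: false, lock_info: nil, queue_info: nil)
  message = "Failed to acquire the lock \"#{lock_key}\" within the timeout (#{timeout} seconds)."
  return message unless detailed

  # In a real client these two values would come from two extra Redis reads
  # performed after the timeout fired, so they may no longer match the state
  # that actually caused the timeout.
  "#{message} Current lock data: #{lock_info.inspect}; lock queue state: #{queue_info.inspect}"
end

short = build_timeout_message(lock_key: "rql:lock:my_lock", timeout: 5)
long = build_timeout_message(
  lock_key: "rql:lock:my_lock",
  timeout: 5,
  detailed: true,
  lock_info: { "acq_id" => "rql:acq:123/456/567/678/374dd74324" },
  queue_info: { "queue" => [] }
)

puts short
puts long
```

The short form costs nothing extra, while the detailed form trades two more reads for a snapshot that should be treated as a hint rather than as exact state.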
@@ -754,6 +833,7 @@ See `#lock` method [documentation](#lock---obtain-a-lock).
754
833
  - lock data (`Hash<String,String|Integer>`):
755
834
  - `"lock_key"` - `string` - lock key in redis;
756
835
  - `"acq_id"` - `string` - acquier identifier (process_id/thread_id/fiber_id/ractor_id/identity);
836
+ - `"hst_id"` - `string` - host identifier (process_id/thread_id/ractor_id/identity);
757
837
  - `"ts"` - `numeric`/`epoch` - the time when lock was obtained;
758
838
  - `"ini_ttl"` - `integer` - (milliseconds) initial lock key ttl;
759
839
  - `"rem_ttl"` - `integer` - (milliseconds) remaining lock key ttl;
@@ -776,6 +856,7 @@ rql.lock_info("your_lock_name")
776
856
  {
777
857
  "lock_key" => "rql:lock:your_lock_name",
778
858
  "acq_id" => "rql:acq:123/456/567/678/374dd74324",
859
+ "hst_id" => "rql:hst:123/456/678/374dd74324",
779
860
  "ts" => 123456789.12345,
780
861
  "ini_ttl" => 5_000,
781
862
  "rem_ttl" => 4_999
@@ -791,6 +872,7 @@ rql.lock_info("your_lock_name")
791
872
  {
792
873
  "lock_key" => "rql:lock:your_lock_name",
793
874
  "acq_id" => "rql:acq:123/456/567/678/374dd74324",
875
+ "hst_id" => "rql:hst:123/456/678/374dd74324",
794
876
  "ts" => 123456789.12345,
795
877
  "ini_ttl" => 5_000,
796
878
  "rem_ttl" => 4_999,
@@ -812,6 +894,7 @@ rql.lock_info("your_lock_name")
812
894
  {
813
895
  "lock_key" => "rql:lock:your_lock_name",
814
896
  "acq_id" => "rql:acq:123/456/567/678/374dd74324",
897
+ "hst_id" => "rql:hst:123/456/678/374dd74324",
815
898
  "ts" => 123456789.12345,
816
899
  "ini_ttl" => 5_000,
817
900
  "rem_ttl" => 9_444,
@@ -897,6 +980,7 @@ rql.queued?("your_lock_name") # => true/false
897
980
 
898
981
  - release the concrete lock with lock request queue;
899
982
  - queue will be released first;
983
+ - has an alias: `#release_lock`;
900
984
  - accepts:
901
985
  - `lock_name` - (required) `[String]` - the lock name that should be released.
902
986
  - `:logger` - (optional) `[::Logger,#debug]`
@@ -917,6 +1001,10 @@ rql.queued?("your_lock_name") # => true/false
917
1001
  - `:log_sampler` - (optional) `[#sampling_happened?,Module<RedisQueuedLocks::Logging::Sampler>]`
918
1002
  - **log sampling**: percent-based log sampler that decides whether an RQL case should be logged;
919
1003
  - pre-configured in `config[:log_sampler]`;
1004
+ - `log_sample_this` - (optional) `[Boolean]`
1005
+ - forces this method call to be logged even when log sampling is enabled;
1006
+ - makes sense when log sampling is enabled;
1007
+ - `false` by default;
920
1008
  - `:instr_sampling_enabled` - (optional) `[Boolean]`
921
1009
  - enables **instrumentation sampling**;
922
1010
  - pre-configured in `config[:instr_sampling_enabled]`;
@@ -926,6 +1014,10 @@ rql.queued?("your_lock_name") # => true/false
926
1014
  - `instr_sampler` - (optional) `[#sampling_happened?,Module<RedisQueuedLocks::Instrument::Sampler>]`
927
1015
  - percent-based instrumentation sampler that decides whether an RQL case should be instrumented;
928
1016
  - pre-configured in `config[:instr_sampler]`;
1017
+ - `instr_sample_this` - (optional) `[Boolean]`
1018
+ - forces this method call to be instrumented even when instrumentation sampling is enabled;
1019
+ - makes sense when instrumentation sampling is enabled;
1020
+ - `false` by default;
929
1021
  - if you try to unlock non-existent lock you will receive `ok: true` result with operation timings
930
1022
  and `:nothing_to_release` result factor inside;
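How `log_sample_this` and `instr_sample_this` interact with sampling can be sketched as a three-step decision. This is a minimal standalone sketch based only on the option descriptions above: `emit_this_call?` and `NeverSampler` are hypothetical names, and the only thing taken from the documentation is the `sampling_happened?(percent)` sampler interface.

```ruby
# Hypothetical helper, not part of the gem: decides whether a single call
# should be logged (or instrumented) under the documented sampling options.
def emit_this_call?(sampling_enabled:, sample_this:, percent:, sampler:)
  return true unless sampling_enabled  # sampling is off: every call is emitted
  return true if sample_this           # explicit per-call override
  sampler.sampling_happened?(percent)  # otherwise the sampler decides
end

# Toy sampler with the documented `sampling_happened?(percent)` interface.
# It never samples, which makes the effect of the override easy to see.
class NeverSampler
  def self.sampling_happened?(_percent)
    false
  end
end

skipped = emit_this_call?(sampling_enabled: true, sample_this: false, percent: 15, sampler: NeverSampler)
forced = emit_this_call?(sampling_enabled: true, sample_this: true, percent: 15, sampler: NeverSampler)
unsampled = emit_this_call?(sampling_enabled: false, sample_this: false, percent: 15, sampler: NeverSampler)

puts [skipped, forced, unsampled].inspect
```

Under this reading the flag is a no-op when sampling is disabled, which matches the note that it only makes sense when sampling is enabled.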
931
1023
 
@@ -964,6 +1056,7 @@ rql.unlock("your_lock_name")
964
1056
 
965
1057
  - release all obtained locks and related lock request queues;
966
1058
  - queues will be released first;
1059
+ - has an alias: `#release_locks`;
967
1060
  - accepts:
968
1061
  - `:batch_size` - (optional) `[Integer]`
969
1062
  - the size of the batch of locks and lock queues that should be cleared at once under one pipelined redis command;
@@ -985,6 +1078,10 @@ rql.unlock("your_lock_name")
985
1078
  - `:log_sampler` - (optional) `[#sampling_happened?,Module<RedisQueuedLocks::Logging::Sampler>]`
986
1079
  - **log sampling**: percent-based log sampler that decides whether an RQL case should be logged;
987
1080
  - pre-configured in `config[:log_sampler]`;
1081
+ - `log_sample_this` - (optional) `[Boolean]`
1082
+ - forces this method call to be logged even when log sampling is enabled;
1083
+ - makes sense when log sampling is enabled;
1084
+ - `false` by default;
988
1085
  - `:instr_sampling_enabled` - (optional) `[Boolean]`
989
1086
  - enables **instrumentation sampling**;
990
1087
  - pre-configured in `config[:instr_sampling_enabled]`;
@@ -994,6 +1091,10 @@ rql.unlock("your_lock_name")
994
1091
  - `instr_sampler` - (optional) `[#sampling_happened?,Module<RedisQueuedLocks::Instrument::Sampler>]`
995
1092
  - percent-based instrumentation sampler that decides whether an RQL case should be instrumented;
996
1093
  - pre-configured in `config[:instr_sampler]`;
1094
+ - `instr_sample_this` - (optional) `[Boolean]`
1095
+ - forces this method call to be instrumented even when instrumentation sampling is enabled;
1096
+ - makes sense when instrumentation sampling is enabled;
1097
+ - `false` by default;
997
1098
  - returns:
998
1099
  - `[Hash<Symbol,Numeric>]` - Format: `{ ok: true, result: Hash<Symbol,Numeric> }`;
999
1100
  - result data:
@@ -1044,6 +1145,10 @@ rql.clear_locks
1044
1145
  - `:log_sampler` - (optional) `[#sampling_happened?,Module<RedisQueuedLocks::Logging::Sampler>]`
1045
1146
  - **log sampling**: percent-based log sampler that decides whether an RQL case should be logged;
1046
1147
  - pre-configured in `config[:log_sampler]`;
1148
+ - `log_sample_this` - (optional) `[Boolean]`
1149
+ - forces this method call to be logged even when log sampling is enabled;
1150
+ - makes sense when log sampling is enabled;
1151
+ - `false` by default;
1047
1152
  - `:instr_sampling_enabled` - (optional) `[Boolean]`
1048
1153
  - enables **instrumentation sampling**;
1049
1154
  - pre-configured in `config[:instr_sampling_enabled]`;
@@ -1053,6 +1158,10 @@ rql.clear_locks
1053
1158
  - `instr_sampler` - (optional) `[#sampling_happened?,Module<RedisQueuedLocks::Instrument::Sampler>]`
1054
1159
  - percent-based instrumentation sampler that decides whether an RQL case should be instrumented;
1055
1160
  - pre-configured in `config[:instr_sampler]`;
1161
+ - `instr_sample_this` - (optional) `[Boolean]`
1162
+ - forces this method call to be instrumented even when instrumentation sampling is enabled;
1163
+ - makes sense when instrumentation sampling is enabled;
1164
+ - `false` by default;
1056
1165
  - returns `{ ok: true, result: :ttl_extended }` when ttl is extended;
1057
1166
  - returns `{ ok: false, result: :async_expire_or_no_lock }` when the lock is not found or has already expired during
1058
1167
  some steps of invocation (see **Important** section below);
@@ -1207,6 +1316,7 @@ rql.locks_info # or rql.locks_info(scan_size: 123)
1207
1316
  :status=>:alive,
1208
1317
  :info=>{
1209
1318
  "acq_id"=>"rql:acq:41478/4320/4340/4360/848818f09d8c3420",
1319
+ "hst_id"=>"rql:hst:41478/4320/4360/848818f09d8c3420",
1210
1320
  "ts"=>1711607112.670343,
1211
1321
  "ini_ttl"=>15000,
1212
1322
  "rem_ttl"=>13998}},
@@ -1290,6 +1400,10 @@ Accepts:
1290
1400
  - `:log_sampler` - (optional) `[#sampling_happened?,Module<RedisQueuedLocks::Logging::Sampler>]`
1291
1401
  - **log sampling**: percent-based log sampler that decides whether an RQL case should be logged;
1292
1402
  - pre-configured in `config[:log_sampler]`;
1403
+ - `log_sample_this` - (optional) `[Boolean]`
1404
+ - forces this method call to be logged even when log sampling is enabled;
1405
+ - makes sense when log sampling is enabled;
1406
+ - `false` by default;
1293
1407
  - `:instr_sampling_enabled` - (optional) `[Boolean]`
1294
1408
  - enables **instrumentation sampling**;
1295
1409
  - pre-configured in `config[:instr_sampling_enabled]`;
@@ -1299,6 +1413,10 @@ Accepts:
1299
1413
  - `instr_sampler` - (optional) `[#sampling_happened?,Module<RedisQueuedLocks::Instrument::Sampler>]`
1300
1414
  - percent-based instrumentation sampler that decides whether an RQL case should be instrumented;
1301
1415
  - pre-configured in `config[:instr_sampler]`;
1416
+ - `instr_sample_this` - (optional) `[Boolean]`
1417
+ - forces this method call to be instrumented even when instrumentation sampling is enabled;
1418
+ - makes sense when instrumentation sampling is enabled;
1419
+ - `false` by default;
1302
1420
 
1303
1421
  Returns: `{ ok: true, processed_queues: Set<String> }` (the list of processed lock queues);
1304
1422
 
@@ -1319,8 +1437,318 @@ rql.clear_dead_requests(dead_ttl: 60 * 60 * 1000) # 1 hour in milliseconds
1319
1437
 
1320
1438
  ---
1321
1439
 
1440
+ #### #current_acquirer_id
1441
+
1442
+ <sup>\[[back to top](#usage)\]</sup>
1443
+
1444
+ - get the current acquirer identifier in RQL notation that you can use for debugging purposes during lock analysis;
1445
+ - acquirer identifier format:
1446
+ ```ruby
1447
+ "rql:acq:#{process_id}/#{thread_id}/#{fiber_id}/#{ractor_id}/#{identity}"
1448
+ ```
1449
+ - because `#lock`/`#lock!` let you customize the `process_id`,
1450
+ `fiber_id`, `thread_id`, `ractor_id` and `unique identity` identifiers, the `#current_acquirer_id` method lets you customize them too;
1451
+
1452
+ Accepts:
1453
+
1454
+ - `process_id:` - (optional) `[Integer,Any]`
1455
+ - `::Process.pid` by default;
1456
+ - `thread_id:` - (optional) `[Integer,Any]`;
1457
+ - `::Thread.current.object_id` by default;
1458
+ - `fiber_id:` - (optional) `[Integer,Any]`;
1459
+ - `::Fiber.current.object_id` by default;
1460
+ - `ractor_id:` - (optional) `[Integer,Any]`;
1461
+ - `::Ractor.current.object_id` by default;
1462
+ - `identity:` - (optional) `[String,Any]`;
1463
+ - this value is calculated once during `RedisQueuedLocks::Client` instantiation and stored in `@uniq_identity`;
1464
+ - this value can be accessed from `RedisQueuedLocks::Client#uniq_identity`;
1465
+ - [Configuration](#configuration) documentation: see `config[:uniq_identifier]`;
1466
+ - [#lock](#lock---obtain-a-lock) method documentation: see `uniq_identifier`;
1467
+
1468
+ ```ruby
1469
+ rql.current_acquirer_id
1470
+
1471
+ # =>
1472
+ "rql:acq:38529/4500/4520/4360/66093702f24a3129"
1473
+ ```
1474
+
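The acquirer identifier format above can be reproduced with plain string interpolation. The sketch below is an illustration of the documented format, not the gem's code: `acquirer_id_for` is a hypothetical helper, and the defaults simply mirror the ones listed under "Accepts" (it assumes Ruby 3.1+ so that `Fiber.current` and `Ractor.current` are available without extra requires).

```ruby
# Hypothetical helper mirroring the documented identifier format.
# It only builds the string; it does not talk to Redis or to the gem.
def acquirer_id_for(identity:,
                    process_id: ::Process.pid,
                    thread_id: ::Thread.current.object_id,
                    fiber_id: ::Fiber.current.object_id,
                    ractor_id: ::Ractor.current.object_id)
  "rql:acq:#{process_id}/#{thread_id}/#{fiber_id}/#{ractor_id}/#{identity}"
end

# With every component given explicitly the result is deterministic.
fixed_id = acquirer_id_for(
  process_id: 38529,
  thread_id: 4500,
  fiber_id: 4520,
  ractor_id: 4360,
  identity: "66093702f24a3129"
)

# With defaults the numeric parts depend on the running process.
live_id = acquirer_id_for(identity: "66093702f24a3129")

puts fixed_id
puts live_id
```

Passing explicit components is mainly useful when reconstructing the identifier of another worker while inspecting lock data.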
1475
+ ---
1476
+
1477
+ #### #current_host_id
1478
+
1479
+ <sup>\[[back to top](#usage)\]</sup>
1480
+
1481
+ - get the current host identifier in RQL notation that you can use for debugging purposes during lock analysis;
1482
+ - the host is a ruby worker (a combination of process/thread/ractor) that is alive and can obtain locks;
1483
+ - the host is limited to the `process`/`thread`/`ractor` combination (without `fiber`) because there is no way to extract
1484
+ all fiber objects from the current ruby process once at least one ractor object is defined (**ObjectSpace** loses
1485
+ the ability to extract `Fiber` and `Thread` objects after any ractor is created) (`Thread` objects are analyzed
1486
+ via the `Thread.list` API, which does not lose this ability);
1487
+ - host identifier format:
1488
+ ```ruby
1489
+ "rql:hst:#{process_id}/#{thread_id}/#{ractor_id}/#{uniq_identity}"
1490
+ ```
1491
+ - because `#lock`/`#lock!` let you customize the `process_id`,
1492
+ `fiber_id`, `thread_id`, `ractor_id` and `unique identity` identifiers, the `#current_host_id` method lets you customize them too
1493
+ (except for `fiber_id`);
1494
+
1495
+ Accepts:
1496
+
1497
+ - `process_id:` - (optional) `[Integer,Any]`;
1498
+ - `::Process.pid` by default;
1499
+ - `thread_id:` - (optional) `[Integer,Any]`;
1500
+ - `::Thread.current.object_id` by default;
1501
+ - `ractor_id:` - (optional) `[Integer,Any]`;
1502
+ - `::Ractor.current.object_id` by default;
1503
+ - `identity:` - (optional) `[String]`;
1504
+ - this value is calculated once during `RedisQueuedLocks::Client` instantiation and stored in `@uniq_identity`;
1505
+ - this value can be accessed from `RedisQueuedLocks::Client#uniq_identity`;
1506
+ - [Configuration](#configuration) documentation: see `config[:uniq_identifier]`;
1507
+ - [#lock](#lock---obtain-a-lock) method documentation: see `uniq_identifier`;
1508
+
1509
+ ```ruby
1510
+ rql.current_host_id
1511
+
1512
+ # =>
1513
+ "rql:hst:38529/4500/4360/66093702f24a3129"
1514
+ ```
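The relationship between the two identifier formats can be illustrated with a small sketch. It assumes that an acquirer identifier lists process/thread/fiber/ractor/identity in that order (as the `acq_id`/`hst_id` pair in the swarm example output suggests). The helper `host_id_from_acquirer_id` is hypothetical and is not part of the `redis_queued_locks` API.

```ruby
# Hypothetical helper (NOT part of the redis_queued_locks API):
# derive a host identifier from an acquirer identifier by dropping the fiber segment.
# Assumed acquirer layout: "rql:acq:<process>/<thread>/<fiber>/<ractor>/<identity>".
ACQ_PREFIX = 'rql:acq:'
HST_PREFIX = 'rql:hst:'

def host_id_from_acquirer_id(acq_id)
  unless acq_id.start_with?(ACQ_PREFIX)
    raise ArgumentError, "not an acquirer id: #{acq_id.inspect}"
  end

  # Split into exactly five segments; the fiber segment is ignored.
  process_id, thread_id, _fiber_id, ractor_id, identity =
    acq_id.delete_prefix(ACQ_PREFIX).split('/', 5)

  "#{HST_PREFIX}#{process_id}/#{thread_id}/#{ractor_id}/#{identity}"
end

puts host_id_from_acquirer_id('rql:acq:17580/2260/2380/2280/3f16b93973612580')
# prints "rql:hst:17580/2260/2280/3f16b93973612580"
```

A mapping like this can help when correlating the `acq_id` and `hst_id` values of a lock during debugging.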
1515
+
1516
+ ---
1517
+
1518
+ #### #possible_host_ids
1519
+
1520
+ <sup>\[[back to top](#usage)\]</sup>
1521
+
1522
+ - return the list (`Array<String>`) of possible host identifiers that can be reached from the current ractor;
1523
+ - the host is a ruby worker (a combination of process/thread/ractor/identity) that is alive and can obtain locks;
1524
+ - the host is limited to the `process`/`thread`/`ractor` combination (without `fiber`) because there is no way to extract
1525
+ all fiber objects from the current ruby process when at least one ractor object is defined (**ObjectSpace** loses
1526
+ the ability to extract `Fiber` and `Thread` objects after any ractor is created) (`Thread` objects are analyzed
1527
+ via the `Thread.list` API, which keeps working after a ractor is created);
1528
+ - host identifier format:
1529
+ ```ruby
1530
+ "rql:hst:#{process_id}/#{thread_id}/#{ractor_id}/#{uniq_identity}"
1531
+ ```
1532
+
1533
+ Accepts:
1534
+
1535
+ - `identity:` - (optional) `[String]`;
1536
+ - this value is calculated once during `RedisQueuedLocks::Client` instantiation and stored in `@uniq_identity`;
1537
+ - this value can be accessed from `RedisQueuedLocks::Client#uniq_identity`;
1538
+ - [Configuration](#configuration) documentation: see `config[:uniq_identifier]`;
1539
+ - [#lock](#lock---obtain-a-lock) method documentation: see `uniq_identifier`;
1540
+
1541
+ ```ruby
1542
+ rql.possible_host_ids
1543
+
1544
+ # =>
1545
+ [
1546
+ "rql:hst:18814/2300/2280/5ce0c4582fc59c06", # process id / thread id / ractor id / uniq identity
1547
+ "rql:hst:18814/2320/2280/5ce0c4582fc59c06", # ...
1548
+ "rql:hst:18814/2340/2280/5ce0c4582fc59c06", # ...
1549
+ "rql:hst:18814/2360/2280/5ce0c4582fc59c06", # ...
1550
+ "rql:hst:18814/2380/2280/5ce0c4582fc59c06", # ...
1551
+ "rql:hst:18814/2400/2280/5ce0c4582fc59c06"
1552
+ ]
1553
+ ```
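As a rough illustration of how such a list could be assembled from `Thread.list`, here is a minimal sketch. It is not the library's implementation: the method name `sketch_possible_host_ids` is hypothetical, and the identity string is passed in explicitly instead of being read from a client.

```ruby
# Hypothetical sketch (NOT the library's implementation): build one host
# identifier per live thread of the current process and the current ractor.
def sketch_possible_host_ids(identity)
  process_id = ::Process.pid
  ractor_id = ::Ractor.current.object_id

  # Thread.list returns the threads that are runnable or stopped (i.e. alive).
  ::Thread.list.map do |thread|
    "rql:hst:#{process_id}/#{thread.object_id}/#{ractor_id}/#{identity}"
  end
end

ids = sketch_possible_host_ids('5ce0c4582fc59c06')
puts ids
```

Each entry differs only in the thread segment, which matches the shape of the example output above.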
1554
+ ---
1555
+
1556
+ ## Swarm Mode and Zombie Locks
1557
+
1558
+ <sup>\[[back to top](#table-of-contents)\]</sup>
1559
+
1560
+ > Eliminate zombie locks with a swarm.
1561
+
1562
+ **This documentation section is in progress!**
1563
+
1564
+ [(work and usage preview (temporary example-based docs))](#work-and-usage-preview-temporary-example-based-docs)
1565
+
1566
+ - [How to Swarm](#how-to-swarm)
1567
+ - [configuration](#)
1568
+ - [swarm_status](#swarm_status)
1569
+ - [swarm_info](#swarm_info)
1570
+ - [swarmize!](#swarmize!)
1571
+ - [deswarmize!](#deswarmize!)
1572
+ - [probe_hosts](#probe_hosts)
1573
+ - [flush_zombies](#flush_zombies)
1574
+ - [zombies_info](#zombies_info)
1575
+ - [zombie_locks](#zombie_locks)
1576
+ - [zombie_hosts](#zombie_hosts)
1577
+ - [zombie_acquiers](#zombie_acquiers)
1578
+
1579
+ <hr>
1580
+
1581
+ #### Work and Usage Preview (temporary example-based docs)
1582
+
1583
+ <sup>\[[back to top](#swarm-mode-and-zombie-locks)\]</sup>
1584
+
1585
+ <details>
1586
+ <summary>configuration</summary>
1587
+
1588
+ ```ruby
1589
+ redis_client = RedisClient.config.new_pool # NOTE: provide your own RedisClient instance
1590
+
1591
+ client = RedisQueuedLocks::Client.new(redis_client) do |config|
1592
+ # NOTE: auto-swarm your RQL client after initialization (run swarm elements and their supervisor)
1593
+ config.swarm.auto_swarm = false
1594
+
1595
+ # supervisor configs
1596
+ config.swarm.supervisor.liveness_probing_period = 2 # NOTE: in seconds
1597
+
1598
+ # (probe_hosts) host probing configuration
1599
+ config.swarm.probe_hosts.enabled_for_swarm = true # NOTE: run host probing or not
1600
+ config.swarm.probe_hosts.probe_period = 2 # NOTE: (in seconds) the period of time when the probing process is triggered
1601
+ # (probe_hosts) individual redis config
1602
+ config.swarm.probe_hosts.redis_config.sentinel = false # NOTE: individual redis config
1603
+ config.swarm.probe_hosts.redis_config.pooled = false # NOTE: individual redis config
1604
+ config.swarm.probe_hosts.redis_config.config = {} # NOTE: individual redis config
1605
+ config.swarm.probe_hosts.redis_config.pool_config = {} # NOTE: individual redis config
1606
+
1607
+ # (flush_zombies) zombie flushing configuration
1608
+ config.swarm.flush_zombies.enabled_for_swarm = true # NOTE: run zombie flushing or not
1609
+ config.swarm.flush_zombies.zombie_flush_period = 10 # NOTE: (in seconds) period of time when the zombie flusher is triggered
1610
+ config.swarm.flush_zombies.zombie_ttl = 15_000 # NOTE: (in milliseconds) when the lock/host/acquier is considered a zombie
1611
+ config.swarm.flush_zombies.zombie_lock_scan_size = 500 # NOTE: scan size during zombie flushing
1612
+ config.swarm.flush_zombies.zombie_queue_scan_size = 500 # NOTE: scan size during zombie flushing
1613
+ # (flush_zombies) individual redis config
1614
+ config.swarm.flush_zombies.redis_config.sentinel = false # NOTE: individual redis config
1615
+ config.swarm.flush_zombies.redis_config.pooled = false # NOTE: individual redis config
1616
+ config.swarm.flush_zombies.redis_config.config = {} # NOTE: individual redis config
1617
+ config.swarm.flush_zombies.redis_config.pool_config = {} # NOTE: individual redis config
+ end
1618
+ ```
1619
+ </details>
1620
+
1621
+ <details>
1622
+ <summary>seed a zombie</summary>
1623
+
1624
+ - obtain a long-lived lock and kill the host process, which turns the lock into a zombie:
1625
+
1626
+ ```ruby
1627
+ daiver => ~/Projects/redis_queued_locks master [$]
1628
+ ➜ bin/console
1629
+ [1] pry(main)> rql = RedisQueuedLocks::Client.new(RedisClient.new);
1630
+ [2] pry(main)> rql.swarmize!
1631
+ /Users/daiver/Projects/redis_queued_locks/lib/redis_queued_locks/swarm/flush_zombies.rb:107: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.
1632
+ => {:ok=>true, :result=>:swarming}
1633
+ [3] pry(main)> rql.lock('kekpek', ttl: 1111111111)
1634
+ => {:ok=>true,
1635
+ :result=>
1636
+ {:lock_key=>"rql:lock:kekpek",
1637
+ :acq_id=>"rql:acq:17580/2260/2380/2280/3f16b93973612580",
1638
+ :hst_id=>"rql:hst:17580/2260/2280/3f16b93973612580",
1639
+ :ts=>1720305351.069259,
1640
+ :ttl=>1111111111,
1641
+ :process=>:lock_obtaining}}
1642
+ [4] pry(main)> exit
1643
+ ```
1644
+ </details>
1645
+
1646
+ <details>
1647
+ <summary>find zombies</summary>
1648
+
1649
+ - start another process and fetch the swarm info: the previous process is now a zombie, and the lock it held is a zombie too:
1650
+
1651
+ ```ruby
1652
+ daiver => ~/Projects/redis_queued_locks master [$] took 27.2s
1653
+ ➜ bin/console
1654
+ [1] pry(main)> rql = RedisQueuedLocks::Client.new(RedisClient.new);
1655
+ [2] pry(main)> rql.swarm_info
1656
+ => {"rql:hst:17580/2260/2280/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 12897/262144 +0300, :last_probe_score=>1720305353.0491982},
1657
+ "rql:hst:17580/2300/2280/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 211107/4194304 +0300, :last_probe_score=>1720305353.0503318},
1658
+ "rql:hst:17580/2320/2280/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 106615/2097152 +0300, :last_probe_score=>1720305353.050838},
1659
+ "rql:hst:17580/2260/2340/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 26239/524288 +0300, :last_probe_score=>1720305353.050047},
1660
+ "rql:hst:17580/2300/2340/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 106359/2097152 +0300, :last_probe_score=>1720305353.050716},
1661
+ "rql:hst:17580/2320/2340/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 213633/4194304 +0300, :last_probe_score=>1720305353.050934},
1662
+ "rql:hst:17580/2360/2280/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 214077/4194304 +0300, :last_probe_score=>1720305353.05104},
1663
+ "rql:hst:17580/2360/2340/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 214505/4194304 +0300, :last_probe_score=>1720305353.051142},
1664
+ "rql:hst:17580/2400/2280/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 53729/1048576 +0300, :last_probe_score=>1720305353.05124},
1665
+ "rql:hst:17580/2400/2340/3f16b93973612580"=>{:zombie=>true, :last_probe_time=>2024-07-07 01:35:53 3365/65536 +0300, :last_probe_score=>1720305353.0513458}}
1666
+ [3] pry(main)> rql.swarm_status
1667
+ => {:auto_swarm=>false,
1668
+ :supervisor=>{:running=>false, :state=>"non_initialized", :observable=>"non_initialized"},
1669
+ :probe_hosts=>{:enabled=>true, :thread=>{:running=>false, :state=>"non_initialized"}, :main_loop=>{:running=>false, :state=>"non_initialized"}},
1670
+ :flush_zombies=>{:enabled=>true, :ractor=>{:running=>false, :state=>"non_initialized"}, :main_loop=>{:running=>false, :state=>"non_initialized"}}}
1671
+ [4] pry(main)> rql.zombies_info
1672
+ => {:zombie_hosts=>
1673
+ #<Set:
1674
+ {"rql:hst:17580/2260/2280/3f16b93973612580",
1675
+ "rql:hst:17580/2300/2280/3f16b93973612580",
1676
+ "rql:hst:17580/2320/2280/3f16b93973612580",
1677
+ "rql:hst:17580/2260/2340/3f16b93973612580",
1678
+ "rql:hst:17580/2300/2340/3f16b93973612580",
1679
+ "rql:hst:17580/2320/2340/3f16b93973612580",
1680
+ "rql:hst:17580/2360/2280/3f16b93973612580",
1681
+ "rql:hst:17580/2360/2340/3f16b93973612580",
1682
+ "rql:hst:17580/2400/2280/3f16b93973612580",
1683
+ "rql:hst:17580/2400/2340/3f16b93973612580"}>,
1684
+ :zombie_acquirers=>#<Set: {"rql:acq:17580/2260/2380/2280/3f16b93973612580"}>,
1685
+ :zombie_locks=>#<Set: {"rql:lock:kekpek"}>}
1686
+ [5] pry(main)> rql.zombie_locks
1687
+ => #<Set: {"rql:lock:kekpek"}>
1688
+ [6] pry(main)> rql.zombie_acquiers
1689
+ => #<Set: {"rql:acq:17580/2260/2380/2280/3f16b93973612580"}>
1690
+ [7] pry(main)> rql.zombie_hosts
1691
+ => #<Set:
1692
+ {"rql:hst:17580/2260/2280/3f16b93973612580",
1693
+ "rql:hst:17580/2300/2280/3f16b93973612580",
1694
+ "rql:hst:17580/2320/2280/3f16b93973612580",
1695
+ "rql:hst:17580/2260/2340/3f16b93973612580",
1696
+ "rql:hst:17580/2300/2340/3f16b93973612580",
1697
+ "rql:hst:17580/2320/2340/3f16b93973612580",
1698
+ "rql:hst:17580/2360/2280/3f16b93973612580",
1699
+ "rql:hst:17580/2360/2340/3f16b93973612580",
1700
+ "rql:hst:17580/2400/2280/3f16b93973612580",
1701
+ "rql:hst:17580/2400/2340/3f16b93973612580"}>
1702
+ ```
1703
+ </details>
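The `:zombie` flags above can be related to the `zombie_ttl` setting with a small sketch. It assumes that a host counts as a zombie once its `last_probe_score` (an epoch timestamp in seconds) is older than `zombie_ttl` milliseconds. That comparison rule is an assumption made for illustration, and the helper `zombie_host_ids` is hypothetical rather than part of the library.

```ruby
# Hypothetical sketch (NOT the library's implementation).
# Assumption: a host is a zombie when its last probe is older than zombie_ttl.
ZOMBIE_TTL_MS = 15_000 # same value as config.swarm.flush_zombies.zombie_ttl above

def zombie_host_ids(probe_scores, now: Time.now.to_f, zombie_ttl_ms: ZOMBIE_TTL_MS)
  # Probes recorded before this moment are considered stale.
  deadline = now - (zombie_ttl_ms / 1000.0)
  probe_scores.select { |_host_id, score| score < deadline }.keys
end

# Probe scores taken from the example sessions in this section.
probes = {
  'rql:hst:17580/2260/2280/3f16b93973612580' => 1720305353.0491982,
  'rql:hst:17752/2260/2280/89beef198021f16d' => 1720305399.956673
}

zombies = zombie_host_ids(probes, now: 1720305400.0)
puts zombies
# prints "rql:hst:17580/2260/2280/3f16b93973612580"
```

With a probe period of 2 seconds and a 15-second TTL, a live host under this assumed rule would have to miss several consecutive probes before being flagged.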
1704
+
1705
+ <details>
1706
+ <summary>kill zombies in the background</summary>
1707
+
1708
+ - swarmize the new ruby process: it runs the flush zombies element, which drops zombie locks, zombie hosts and their lock requests in the background:
1709
+
1710
+ ```ruby
1711
+ [8] pry(main)> rql.swarmize!
1712
+ /Users/daiver/Projects/redis_queued_locks/lib/redis_queued_locks/swarm/flush_zombies.rb:107: warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues.
1713
+ => {:ok=>true, :result=>:swarming}
1714
+ [9] pry(main)> rql.swarm_info
1715
+ => {"rql:hst:17752/2260/2280/89beef198021f16d"=>{:zombie=>false, :last_probe_time=>2024-07-07 01:36:39 4012577/4194304 +0300, :last_probe_score=>1720305399.956673},
1716
+ "rql:hst:17752/2300/2280/89beef198021f16d"=>{:zombie=>false, :last_probe_time=>2024-07-07 01:36:39 4015233/4194304 +0300, :last_probe_score=>1720305399.9573061},
1717
+ "rql:hst:17752/2320/2280/89beef198021f16d"=>{:zombie=>false, :last_probe_time=>2024-07-07 01:36:39 4016755/4194304 +0300, :last_probe_score=>1720305399.957669},
1718
+ "rql:hst:17752/2260/2340/89beef198021f16d"=>{:zombie=>false, :last_probe_time=>2024-07-07 01:36:39 1003611/1048576 +0300, :last_probe_score=>1720305399.957118},
1719
+ "rql:hst:17752/2300/2340/89beef198021f16d"=>{:zombie=>false, :last_probe_time=>2024-07-07 01:36:39 2008027/2097152 +0300, :last_probe_score=>1720305399.957502},
1720
+ "rql:hst:17752/2320/2340/89beef198021f16d"=>{:zombie=>false, :last_probe_time=>2024-07-07 01:36:39 2008715/2097152 +0300, :last_probe_score=>1720305399.95783},
1721
+ "rql:hst:17752/2360/2280/89beef198021f16d"=>{:zombie=>false, :last_probe_time=>2024-07-07 01:36:39 4018063/4194304 +0300, :last_probe_score=>1720305399.9579809},
1722
+ "rql:hst:17752/2360/2340/89beef198021f16d"=>{:zombie=>false, :last_probe_time=>2024-07-07 01:36:39 1004673/1048576 +0300, :last_probe_score=>1720305399.9581308}}
1723
+ [10] pry(main)> rql.swarm_status
1724
+ => {:auto_swarm=>false,
1725
+ :supervisor=>{:running=>true, :state=>"sleep", :observable=>"initialized"},
1726
+ :probe_hosts=>{:enabled=>true, :thread=>{:running=>true, :state=>"sleep"}, :main_loop=>{:running=>true, :state=>"sleep"}},
1727
+ :flush_zombies=>{:enabled=>true, :ractor=>{:running=>true, :state=>"running"}, :main_loop=>{:running=>true, :state=>"sleep"}}}
1728
+ [11] pry(main)> rql.zombies_info
1729
+ => {:zombie_hosts=>#<Set: {}>, :zombie_acquirers=>#<Set: {}>, :zombie_locks=>#<Set: {}>}
1730
+ [12] pry(main)> rql.zombie_acquiers
1731
+ => #<Set: {}>
1732
+ [13] pry(main)> rql.zombie_hosts
1733
+ => #<Set: {}>
1734
+ [14] pry(main)>
1735
+ ```
1736
+ </details>
1737
+
1738
+ <details>
1739
+ <summary>swarm hosts key in Redis</summary>
1740
+
1741
+ ```ruby
1742
+ "rql:swarm:hsts"
1743
+ ```
1744
+ </details>
1745
+
1746
+ ---
1747
+
1322
1748
  ## Lock Access Strategies
1323
1749
 
1750
+ <sup>\[[back to top](#table-of-contents)\]</sup>
1751
+
1324
1752
  - **this documentation section is in progress**;
1325
1753
  - (a few details for context on the current implementation and features):
1326
1754
  - defines the way in which the lock should be obtained;
@@ -1360,34 +1788,34 @@ rql.clear_dead_requests(dead_ttl: 60 * 60 * 1000) # 1 hour in milliseconds
1360
1788
  - default logs (raised from `#lock`/`#lock!`):
1361
1789
 
1362
1790
  ```ruby
1363
- "[redis_queued_locks.start_lock_obtaining]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1364
- "[redis_queued_locks.start_try_to_lock_cycle]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1365
- "[redis_queued_locks.dead_score_reached__reset_acquier_position]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1366
- "[redis_queued_locks.lock_obtained]" # (logs "lock_key", "queue_ttl", "acq_id", "acq_time");
1367
- "[redis_queued_locks.extendable_reentrant_lock_obtained]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "acq_time");
1368
- "[redis_queued_locks.reentrant_lock_obtained]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "acq_time");
1369
- "[redis_queued_locks.fail_fast_or_limits_reached_or_deadlock__dequeue]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1370
- "[redis_queued_locks.expire_lock]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1371
- "[redis_queued_locks.decrease_lock]" # (logs "lock_key", "decreased_ttl", "queue_ttl", "acq_id", "acs_strat");
1791
+ "[redis_queued_locks.start_lock_obtaining]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1792
+ "[redis_queued_locks.start_try_to_lock_cycle]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1793
+ "[redis_queued_locks.dead_score_reached__reset_acquier_position]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1794
+ "[redis_queued_locks.lock_obtained]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acq_time");
1795
+ "[redis_queued_locks.extendable_reentrant_lock_obtained]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "acq_time");
1796
+ "[redis_queued_locks.reentrant_lock_obtained]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "acq_time");
1797
+ "[redis_queued_locks.fail_fast_or_limits_reached_or_deadlock__dequeue]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1798
+ "[redis_queued_locks.expire_lock]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1799
+ "[redis_queued_locks.decrease_lock]" # (logs "lock_key", "decreased_ttl", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1372
1800
  ```
1373
1801
 
1374
1802
  - additional logs (raised from `#lock`/`#lock!` with `config[:log_lock_try] == true`):
1375
1803
 
1376
1804
  ```ruby
1377
- "[redis_queued_locks.try_lock.start]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1378
- "[redis_queued_locks.try_lock.rconn_fetched]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1379
- "[redis_queued_locks.try_lock.same_process_conflict_detected]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1380
- "[redis_queued_locks.try_lock.same_process_conflict_analyzed]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "spc_status");
1381
- "[redis_queued_locks.try_lock.reentrant_lock__extend_and_work_through]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "spc_status", "last_ext_ttl", "last_ext_ts");
1382
- "[redis_queued_locks.try_lock.reentrant_lock__work_through]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "spc_status", last_spc_ts);
1383
- "[redis_queued_locks.try_lock.single_process_lock_conflict__dead_lock]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "spc_status", "last_spc_ts");
1384
- "[redis_queued_locks.try_lock.acq_added_to_queue]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1385
- "[redis_queued_locks.try_lock.remove_expired_acqs]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1386
- "[redis_queued_locks.try_lock.get_first_from_queue]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "first_acq_id_in_queue");
1387
- "[redis_queued_locks.try_lock.exit__queue_ttl_reached]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1388
- "[redis_queued_locks.try_lock.exit__no_first]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "first_acq_id_in_queue", "<current_lock_data>");
1389
- "[redis_queued_locks.try_lock.exit__lock_still_obtained]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat", "first_acq_id_in_queue", "locked_by_acq_id", "<current_lock_data>");
1390
- "[redis_queued_locks.try_lock.obtain__free_to_acquire]" # (logs "lock_key", "queue_ttl", "acq_id", "acs_strat");
1805
+ "[redis_queued_locks.try_lock.start]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1806
+ "[redis_queued_locks.try_lock.rconn_fetched]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1807
+ "[redis_queued_locks.try_lock.same_process_conflict_detected]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1808
+ "[redis_queued_locks.try_lock.same_process_conflict_analyzed]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "spc_status");
1809
+ "[redis_queued_locks.try_lock.reentrant_lock__extend_and_work_through]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "spc_status", "last_ext_ttl", "last_ext_ts");
1810
+ "[redis_queued_locks.try_lock.reentrant_lock__work_through]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "spc_status", "last_spc_ts");
1811
+ "[redis_queued_locks.try_lock.single_process_lock_conflict__dead_lock]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "spc_status", "last_spc_ts");
1812
+ "[redis_queued_locks.try_lock.acq_added_to_queue]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1813
+ "[redis_queued_locks.try_lock.remove_expired_acqs]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1814
+ "[redis_queued_locks.try_lock.get_first_from_queue]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "first_acq_id_in_queue");
1815
+ "[redis_queued_locks.try_lock.exit__queue_ttl_reached]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1816
+ "[redis_queued_locks.try_lock.exit__no_first]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "first_acq_id_in_queue", "<current_lock_data>");
1817
+ "[redis_queued_locks.try_lock.exit__lock_still_obtained]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat", "first_acq_id_in_queue", "locked_by_acq_id", "<current_lock_data>");
1818
+ "[redis_queued_locks.try_lock.obtain__free_to_acquire]" # (logs "lock_key", "queue_ttl", "acq_id", "hst_id", "acs_strat");
1391
1819
  ```
1392
1820
 
1393
1821
  ---
@@ -1437,6 +1865,7 @@ Detalized event semantics and payload structure:
1437
1865
  - payload:
1438
1866
  - `:ttl` - `integer`/`milliseconds` - lock ttl;
1439
1867
  - `:acq_id` - `string` - lock acquier identifier;
1868
+ - `:hst_id` - `string` - lock's host identifier;
1440
1869
  - `:lock_key` - `string` - lock name;
1441
1870
  - `:ts` - `numeric`/`epoch` - the time when the lock was obtained;
1442
1871
  - `:acq_time` - `float`/`milliseconds` - time spent on lock acquiring;
@@ -1449,6 +1878,7 @@ Detalized event semantics and payload structure:
1449
1878
  - `:lock_key` - `string` - lock name;
1450
1879
  - `:ttl` - `integer`/`milliseconds` - last lock ttl by reentrant locking;
1451
1880
  - `:acq_id` - `string` - lock acquier identifier;
1881
+ - `:hst_id` - `string` - lock's host identifier;
1452
1882
  - `:ts` - `numeric`/`epoch` - the time when the lock was obtained as extendable reentrant lock;
1453
1883
  - `:acq_time` - `float`/`milliseconds` - time spent on lock acquiring;
1454
1884
  - `:instrument` - `nil`/`Any` - custom data passed to the `#lock`/`#lock!` method as `:instrument` attribute;
@@ -1460,6 +1890,7 @@ Detalized event semantics and payload structure:
1460
1890
  - `:lock_key` - `string` - lock name;
1461
1891
  - `:ttl` - `integer`/`milliseconds` - last lock ttl by reentrant locking;
1462
1892
  - `:acq_id` - `string` - lock acquier identifier;
1893
+ - `:hst_id` - `string` - lock's host identifier;
1463
1894
  - `:ts` - `numeric`/`epoch` - the time when the lock was obtained as reentrant lock;
1464
1895
  - `:acq_time` - `float`/`milliseconds` - time spent on lock acquiring;
1465
1896
  - `:instrument` - `nil`/`Any` - custom data passed to the `#lock`/`#lock!` method as `:instrument` attribute;
@@ -1471,6 +1902,7 @@ Detalized event semantics and payload structure:
1471
1902
  - `:hold_time` - `float`/`milliseconds` - lock hold time;
1472
1903
  - `:ttl` - `integer`/`milliseconds` - lock ttl;
1473
1904
  - `:acq_id` - `string` - lock acquier identifier;
1905
+ - `:hst_id` - `string` - lock's host identifier;
1474
1906
  - `:lock_key` - `string` - lock name;
1475
1907
  - `:ts` - `numeric`/`epoch` - the time when lock was obtained;
1476
1908
  - `:acq_time` - `float`/`milliseconds` - time spent on lock acquiring;
@@ -1484,6 +1916,7 @@ Detalized event semantics and payload structure:
1484
1916
  - `:hold_time` - `float`/`milliseconds` - lock hold time;
1485
1917
  - `:ttl` - `integer`/`milliseconds` - last lock ttl by reentrant locking;
1486
1918
  - `:acq_id` - `string` - lock acquier identifier;
1919
+ - `:hst_id` - `string` - lock's host identifier;
1487
1920
  - `:ts` - `numeric`/`epoch` - the time when the lock was obtained as reentrant lock;
1488
1921
  - `:lock_key` - `string` - lock name;
1489
1922
  - `:acq_time` - `float`/`milliseconds` - time spent on lock acquiring;
@@ -1513,6 +1946,12 @@ Detalized event semantics and payload structure:
1513
1946
  <sup>\[[back to top](#table-of-contents)\]</sup>
1514
1947
 
1515
1948
  - **Major**:
1949
+ - Swarm:
1950
+ - circuit-breaker for long-living failures of your infrastructure inside the swarm elements and supervisor:
1951
+ the supervisor will stop (for some period of time, or while some condition returns `true`)
1952
+ trying to resurrect unexpectedly terminated swarm elements, and will notify about this;
1953
+ - swarm logs (threads/ractors have some limitations, so the initial implementation does not include swarm logging);
1954
+ - swarm instrumentation (threads/ractors have some limitations, so the initial implementation does not include swarm instrumentation);
1516
1955
  - lock request prioritization;
1517
1956
  - **strict redlock algorithm support** (support for many `RedisClient` instances);
1518
1957
  - `#lock_series` - acquire a series of locks: