redis-message-queue 8.0.2__tar.gz → 8.0.3__tar.gz

Files changed (23)
  1. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/PKG-INFO +100 -4
  2. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/README.md +99 -3
  3. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/pyproject.toml +1 -1
  4. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_config.py +3 -0
  5. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_queue_key_manager.py +8 -0
  6. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_redis_cluster.py +25 -0
  7. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/asyncio/redis_message_queue.py +99 -11
  8. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/interrupt_handler/_implementation.py +6 -0
  9. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/redis_message_queue.py +81 -6
  10. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/LICENSE +0 -0
  11. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/__init__.py +0 -0
  12. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_abstract_redis_gateway.py +0 -0
  13. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_callable_utils.py +0 -0
  14. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_event.py +0 -0
  15. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_exceptions.py +0 -0
  16. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_redis_gateway.py +0 -0
  17. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/_stored_message.py +0 -0
  18. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/asyncio/__init__.py +0 -0
  19. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/asyncio/_abstract_redis_gateway.py +0 -0
  20. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/asyncio/_redis_gateway.py +0 -0
  21. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/interrupt_handler/__init__.py +0 -0
  22. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/interrupt_handler/_interface.py +0 -0
  23. {redis_message_queue-8.0.2 → redis_message_queue-8.0.3}/redis_message_queue/py.typed +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: redis-message-queue
3
- Version: 8.0.2
3
+ Version: 8.0.3
4
4
  Summary: Python message queuing with Redis and message deduplication
5
5
  License: MIT
6
6
  License-File: LICENSE
@@ -68,6 +68,13 @@ with queue.process_message() as message:
68
68
  `RedisMessageQueue` itself is not a context manager. Use
69
69
  `with queue.process_message() as message:` for each message.
70
70
 
71
+ > **Important:** In the sync API, work inside `process_message()` must be
72
+ > synchronous. If your handler is `async def`, returns a coroutine, or returns
73
+ > any other awaitable, use `redis_message_queue.asyncio.RedisMessageQueue`.
74
+ > The sync context manager does not inspect the handler's return value; an
75
+ > unawaited coroutine can be dropped while the message is acked. An ergonomic
76
+ > callback API that detects this is planned for v8.1.
77
+
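A minimal sketch of the failure mode described in the note above, runnable without Redis. The names `handle` and `sync_consume` are hypothetical and are not part of redis-message-queue; the point is only that calling an `async def` handler from synchronous code creates a coroutine object and never runs the handler body.

```python
import inspect

processed = []


async def handle(message):
    # Hypothetical async handler: the body only runs when the coroutine is awaited.
    processed.append(message)


def sync_consume(message):
    # Stand-in for work done inside a synchronous `with ... as message:` block.
    result = handle(message)  # creates a coroutine object, does not execute it
    was_coroutine = inspect.iscoroutine(result)
    result.close()  # close it explicitly so this demo emits no "never awaited" warning
    return was_coroutine


returned_coroutine = sync_consume("order-1")
print(returned_coroutine, processed)
```

In this sketch the synchronous caller finishes normally even though `processed` stays empty, which is the situation in which a message could be acked without its work having run.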
71
78
  ### Async quickstart
72
79
 
73
80
  ```python
@@ -119,6 +126,12 @@ All features are optional and can be enabled or disabled as needed.
119
126
 
120
127
  See [Crash recovery with visibility timeout](#crash-recovery-with-visibility-timeout) for details and tradeoffs.
121
128
 
129
+ > **Important:** Handler exceptions are terminal. This library is a payload
130
+ > queue, not a task framework: raising inside `process_message()` does not
131
+ > requeue the message. With `enable_failed_queue=False`, the message is removed
132
+ > from `processing`; with `enable_failed_queue=True`, it is moved to the failed
133
+ > list.
134
+
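Because a raised exception is terminal, any retry has to happen in application code before the exception leaves the handler. The helper below is one possible sketch of that idea; `call_with_retries` and `flaky_handler` are hypothetical names, and nothing here is provided by redis-message-queue.

```python
import time


def call_with_retries(handler, message, attempts=3, delay_seconds=0.0):
    """Run handler(message), retrying in-process before letting the error propagate."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return handler(message)
        except Exception:  # in real code, catch only errors known to be transient
            if attempt == attempts:
                raise  # final failure: the caller sees the original exception
            time.sleep(delay_seconds)


calls = []


def flaky_handler(message):
    # Fails twice, then succeeds, to exercise the retry loop.
    calls.append(message)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return f"handled {message}"


result = call_with_retries(flaky_handler, "order-1")
print(result, len(calls))
```

Called from inside the `process_message()` block, a wrapper like this keeps transient failures local; only the final failure would reach the queue and trigger the removal or failed-list behavior described above.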
122
135
  ## Configuration
123
136
 
124
137
  ### Deduplication
@@ -159,6 +172,13 @@ Avoid fallback patterns such as `lambda msg: msg.get("order_id", "")`.
159
172
  Missing fields should fail loudly instead of collapsing unrelated messages into
160
173
  one deduplication key.
161
174
 
175
+ Deduplication markers and publish retry-safety markers are Redis TTL keys. A
176
+ large forward step in the Redis server expiration clock during an in-call retry
177
+ window can expire those markers before the Python-side monotonic retry budget
178
+ elapses, allowing a duplicate publish. This is an extreme anomaly, mainly
179
+ relevant under cluster-wide NTP step corrections while a producer is retrying
180
+ after an ambiguous Redis write.
181
+
162
182
  ### Success and failure tracking
163
183
 
164
184
  ```python
@@ -234,6 +254,11 @@ queue = RedisMessageQueue(
234
254
  )
235
255
  ```
236
256
 
257
+ > **Important:** `visibility_timeout_seconds` is a lease, not a handler runtime
258
+ > cap. rmq never interrupts a long-running handler. If the lease expires while
259
+ > the handler continues, another consumer can reclaim and process the same
260
+ > message concurrently.
261
+
237
262
  This enables lease-based redelivery for messages left in `processing` by a crashed worker and renews the lease while a healthy long-running handler is still working.
238
263
  Tradeoffs:
239
264
  - delivery becomes at-least-once after lease expiry
@@ -258,6 +283,13 @@ The callback is **advisory** — it may fire briefly after a successful `process
258
283
 
259
284
  Without a visibility timeout, messages already moved to `processing` remain there indefinitely after a consumer crash and are not redelivered, even if the crash happened before your handler started running.
260
285
 
286
+ Visibility deadlines use Redis server time (`TIME`), not Python process time.
287
+ A forward step in the Redis server clock can make a live lease appear expired
288
+ and allow premature redelivery while the original consumer is still processing;
289
+ a backward step can delay reclaim of truly abandoned messages. Treat NTP step
290
+ corrections on Redis hosts as a deployment risk. Prefer time-synchronization
291
+ discipline that slews corrections rather than stepping the Redis clock.
292
+
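The clock-step caveat above can be illustrated with a small model. This is a simulation under stated assumptions, not the library's implementation: `Lease`, `grant_lease`, and `is_reclaimable` are hypothetical names, the server clock is a plain float, and the `>=` comparison is assumed for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    deadline: float  # absolute time on the simulated Redis server clock


def grant_lease(server_now, visibility_timeout_seconds):
    return Lease(deadline=server_now + visibility_timeout_seconds)


def is_reclaimable(lease, server_now):
    # A lease is treated as expired once the server clock reaches its deadline.
    return server_now >= lease.deadline


server_clock = 1_000.0
lease = grant_lease(server_clock, visibility_timeout_seconds=30)

# 5 real seconds later with a steady clock: the lease is still live.
live_after_5s = is_reclaimable(lease, server_clock + 5)

# The same 5 real seconds, but the server clock was stepped forward by 60 s.
after_forward_step = is_reclaimable(lease, server_clock + 5 + 60)

# 40 real seconds after a crash, but the clock was stepped back by 20 s.
after_backward_step = is_reclaimable(lease, server_clock + 40 - 20)

print(live_after_5s, after_forward_step, after_backward_step)
```

The forward step makes a 5-second-old lease look expired, and the backward step hides a lease that has really been abandoned for 40 seconds, matching the two cases in the paragraph above.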
261
293
  ### Ordering and multi-consumer fairness
262
294
 
263
295
  The built-in queue is a shared-pull Redis list. Successful publishes push to the
@@ -354,6 +386,11 @@ while not interrupt.is_interrupted():
354
386
  > `ValueError`. A repeated owned signal falls back to the default behavior
355
387
  > (for example, a second Ctrl+C raises `KeyboardInterrupt`). If you need multiple
356
388
  > shutdown hooks, use a single handler and fan out in your own code.
389
+ >
390
+ > Process-global signal ownership cannot be safely chained with task-worker
391
+ > CLIs such as Celery, RQ, or Dramatiq. Run sibling workers in separate
392
+ > processes, or install one top-level signal owner that calls `queue.drain()`
393
+ > / `queue.aclose()` or sets an application stop event.
357
394
 
358
395
  There are three distinct shutdown shapes; pick the one that matches your runtime:
359
396
 
@@ -383,7 +420,10 @@ if a publish is already inside the queue instance's publish path, drain waits
383
420
  for that publish to finish before it returns; publishes that arrive after the
384
421
  drained flag is set are rejected. The drained state is local to that Python
385
422
  queue object and is not written to Redis, so constructing a fresh
386
- `RedisMessageQueue(...)` over the same keys remains usable.
423
+ `RedisMessageQueue(...)` over the same keys remains usable. A separate process
424
+ or separate queue instance against the same Redis keys is not marked drained by
425
+ this call. For multi-process graceful shutdown, each process must drain its own
426
+ queue instances.
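Since the drained state is per object, a process that owns several queue instances needs to drain each one. The sketch below shows one way to do that with a bounded number of retries. `drain_local_queues` is a hypothetical helper, `StubQueue` is a stand-in rather than the real `RedisMessageQueue`, and the `timeout=` keyword is assumed from the README's mention of `timeout=0`.

```python
class StubQueue:
    """Stand-in for a queue instance; NOT the real RedisMessageQueue."""

    def __init__(self, results):
        self._results = list(results)  # scripted return values for drain()
        self.drain_calls = 0

    def drain(self, timeout):
        self.drain_calls += 1
        return self._results.pop(0) if self._results else True


def drain_local_queues(queues, timeout=5.0, max_attempts=3):
    """Drain every queue owned by this process, retrying the ones that report False."""
    pending = list(queues)
    for _ in range(max_attempts):
        pending = [queue for queue in pending if not queue.drain(timeout=timeout)]
        if not pending:
            return True
    return False


first = StubQueue([True])
second = StubQueue([False, True])  # first drain attempt leaves claim IDs pending
all_drained = drain_local_queues([first, second])
print(all_drained, first.drain_calls, second.drain_calls)
```

Each worker process would run its own copy of such a helper in its shutdown path; nothing in one process can mark another process's queue objects as drained.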
387
427
 
388
428
  Drain does not cancel in-flight handlers — the caller must arrange handler
389
429
  exit through normal thread/task coordination. Returns `True` if all in-memory
@@ -391,6 +431,24 @@ pending claim IDs were recovered within the timeout; `False` if the deadline
391
431
  fired or transient Redis errors left claim IDs pending (call again to retry).
392
432
  `timeout=0` reports current state without attempting recovery.
393
433
 
434
+ #### Abandoned in-flight messages
435
+
436
+ Abandoned in-flight messages are recovered lazily. Async tasks cancelled
437
+ without `aclose()`, or sync processes killed mid-handler, can leave the message
438
+ and its processing/lease metadata in Redis until a later consumer claim path
439
+ triggers visibility-timeout reclaim. With visibility timeouts enabled, this is
440
+ the designed at-least-once recovery path: the message is delayed by the lease,
441
+ not lost. With `visibility_timeout_seconds=None`, there is no automatic reclaim
442
+ path. For low-visibility-timeout workloads, prefer an explicit `drain()` /
443
+ `aclose()` during shutdown so local pending claim IDs are recovered before
444
+ process exit.
445
+
446
+ `drain()` / `aclose()` timeouts are measured with Python monotonic clocks, but
447
+ any lease deadlines they recover were created from Redis server time. The same
448
+ Redis-clock step caveats from
449
+ [Crash recovery with visibility timeout](#crash-recovery-with-visibility-timeout)
450
+ apply to when abandoned work becomes reclaimable.
451
+
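For the async case, one way to avoid cancelling a task without `aclose()` is to close the queue in a `finally` block. The sketch below uses a stand-in `StubAsyncQueue` (not the real async `RedisMessageQueue`) and a hypothetical `consume_until_cancelled` coroutine, so it only demonstrates the control flow.

```python
import asyncio


class StubAsyncQueue:
    """Stand-in for an async queue instance; only records that aclose() ran."""

    def __init__(self):
        self.closed = False

    async def aclose(self):
        self.closed = True
        return True


async def consume_until_cancelled(queue):
    try:
        while True:
            await asyncio.sleep(3600)  # placeholder for the real consume loop
    finally:
        # Runs on cancellation too, so local pending claims get a chance to recover.
        await queue.aclose()


async def main():
    queue = StubAsyncQueue()
    task = asyncio.create_task(consume_until_cancelled(queue))
    await asyncio.sleep(0)  # let the task start and reach its first await
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return queue.closed


closed_on_cancel = asyncio.run(main())
print(closed_on_cancel)
```

A second cancellation delivered while the `finally` block is awaiting can still interrupt the cleanup, so this pattern reduces the window for abandoned messages rather than eliminating it.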
394
452
  > **Heartbeat caveat (best-effort stop):** when `heartbeat_interval_seconds` is
395
453
  > set, the heartbeat sidecar's `stop()` is bounded but not strictly quiescent —
396
454
  > a slow renewal in flight when `process_message` exits may still write to
@@ -495,6 +553,42 @@ await client.aclose()
495
553
  For the sync Redis client, call `client.close()` during application shutdown when
496
554
  you own the client lifecycle.
497
555
 
556
+ ## Migrating from RQ / Celery / Dramatiq / taskiq
557
+
558
+ redis-message-queue is a payload queue, not a task framework. It has no task
559
+ registry, job object, result backend, scheduler, workflow canvas, callback
560
+ graph, or handler-level retry policy. Producers publish a `str` or `dict`
561
+ payload, and consumers decide what that payload means.
562
+
563
+ The most important semantic differences from sibling task libraries are:
564
+
565
+ - Handler exceptions are terminal. Raising inside `process_message()` removes
566
+ the message from `processing`, or moves it to the failed list when
567
+ `enable_failed_queue=True`; it does not requeue or retry the message.
568
+ - `visibility_timeout_seconds` is a crash/stall recovery lease, not a runtime
569
+ limit. Slow handlers are not interrupted; after the lease expires another
570
+ consumer can process the same payload concurrently.
571
+ - `on_event` is telemetry only. Callback exceptions are logged and emitted as
572
+ `RuntimeWarning`, but they do not affect ack/nack, failed-queue movement, or
573
+ any other message outcome. Do not use `on_event` for sagas, follow-up writes,
574
+ billing callbacks, or other correctness-critical work.
575
+ - Dict payloads are JSON data, not Python call arguments. JSON does not
576
+ preserve every Python type: tuples become lists, and sets or custom objects
577
+ raise unless you encode them into JSON-native values first.
578
+ - Process-global signal ownership cannot be safely chained with Celery, RQ, or
579
+ Dramatiq CLI workers. Prefer one top-level owner that calls `queue.drain()`
580
+ or sets an application stop event, and run sibling workers in separate
581
+ processes.
582
+
583
+ When migrating on the same Redis deployment, prefer separate Redis DBs or hard
584
+ namespaces. Do not point a Celery, RQ, Dramatiq, or taskiq worker at an rmq
585
+ pending key. A sibling worker can pop the rmq stored message, fail its own
586
+ decoder, and leave the rmq queue without that message. Also avoid custom
587
+ `key_separator` values that synthesize another library's key namespace, such as
588
+ using `":queue:"` with a queue name that overlaps RQ keys. rmq has no fixed
589
+ library prefix; generated keys share the Redis DB namespace with every other
590
+ Redis user.
591
+
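The JSON bullet in the migration list above can be checked with the standard library alone. The sketch below shows a tuple turning into a list and a set being rejected; `to_json_native` is a hypothetical pre-publish helper, not something the package ships.

```python
import json

payload = {"order_id": 7, "items": ("a", "b")}
round_tripped = json.loads(json.dumps(payload))  # the tuple comes back as a list

try:
    json.dumps({"tags": {"x", "y"}})
    set_error = None
except TypeError as exc:  # sets are not JSON-serializable
    set_error = exc


def to_json_native(value):
    """Recursively convert tuples and sets into lists before publishing."""
    if isinstance(value, dict):
        return {key: to_json_native(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_native(item) for item in value]
    if isinstance(value, (set, frozenset)):
        # Sorting gives a stable order; it assumes the elements are mutually comparable.
        return sorted(to_json_native(item) for item in value)
    return value


encoded = json.dumps(to_json_native({"tags": {"y", "x"}}))
print(round_tripped, type(set_error).__name__, encoded)
```

Converting to JSON-native values on the producer side keeps the consumer's view of the payload identical to what was published.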
498
592
  ## Production notes
499
593
 
500
594
  ### Fork safety and pre-fork servers
@@ -610,8 +704,10 @@ Events cover publish, dedup hits, claim/empty polls, reclaim, ack/nack,
610
704
  completed/failed cleanup, DLQ moves, heartbeat renewal, stale leases, cleanup
611
705
  and trim failures, and retry attempts. Callback exceptions are logged and
612
706
  reported with `RuntimeWarning`, but never propagate into queue operations.
613
- Package logs remain diagnostic; use `on_event` rather than log parsing for
614
- metrics.
707
+ `on_event` is telemetry only: use it for metrics, tracing, and logging, not for
708
+ sagas, follow-up writes, billing callbacks, or other correctness-critical
709
+ work. Package logs remain diagnostic; use `on_event` rather than log parsing
710
+ for metrics.
615
711
 
616
712
  ```python
617
713
  from opentelemetry import trace
@@ -42,6 +42,13 @@ with queue.process_message() as message:
42
42
  `RedisMessageQueue` itself is not a context manager. Use
43
43
  `with queue.process_message() as message:` for each message.
44
44
 
45
+ > **Important:** In the sync API, work inside `process_message()` must be
46
+ > synchronous. If your handler is `async def`, returns a coroutine, or returns
47
+ > any other awaitable, use `redis_message_queue.asyncio.RedisMessageQueue`.
48
+ > The sync context manager does not inspect the handler's return value; an
49
+ > unawaited coroutine can be dropped while the message is acked. An ergonomic
50
+ > callback API that detects this is planned for v8.1.
51
+
45
52
  ### Async quickstart
46
53
 
47
54
  ```python
@@ -93,6 +100,12 @@ All features are optional and can be enabled or disabled as needed.
93
100
 
94
101
  See [Crash recovery with visibility timeout](#crash-recovery-with-visibility-timeout) for details and tradeoffs.
95
102
 
103
+ > **Important:** Handler exceptions are terminal. This library is a payload
104
+ > queue, not a task framework: raising inside `process_message()` does not
105
+ > requeue the message. With `enable_failed_queue=False`, the message is removed
106
+ > from `processing`; with `enable_failed_queue=True`, it is moved to the failed
107
+ > list.
108
+
96
109
  ## Configuration
97
110
 
98
111
  ### Deduplication
@@ -133,6 +146,13 @@ Avoid fallback patterns such as `lambda msg: msg.get("order_id", "")`.
133
146
  Missing fields should fail loudly instead of collapsing unrelated messages into
134
147
  one deduplication key.
135
148
 
149
+ Deduplication markers and publish retry-safety markers are Redis TTL keys. A
150
+ large forward step in the Redis server expiration clock during an in-call retry
151
+ window can expire those markers before the Python-side monotonic retry budget
152
+ elapses, allowing a duplicate publish. This is an extreme anomaly, mainly
153
+ relevant under cluster-wide NTP step corrections while a producer is retrying
154
+ after an ambiguous Redis write.
155
+
136
156
  ### Success and failure tracking
137
157
 
138
158
  ```python
@@ -208,6 +228,11 @@ queue = RedisMessageQueue(
208
228
  )
209
229
  ```
210
230
 
231
+ > **Important:** `visibility_timeout_seconds` is a lease, not a handler runtime
232
+ > cap. rmq never interrupts a long-running handler. If the lease expires while
233
+ > the handler continues, another consumer can reclaim and process the same
234
+ > message concurrently.
235
+
211
236
  This enables lease-based redelivery for messages left in `processing` by a crashed worker and renews the lease while a healthy long-running handler is still working.
212
237
  Tradeoffs:
213
238
  - delivery becomes at-least-once after lease expiry
@@ -232,6 +257,13 @@ The callback is **advisory** — it may fire briefly after a successful `process
232
257
 
233
258
  Without a visibility timeout, messages already moved to `processing` remain there indefinitely after a consumer crash and are not redelivered, even if the crash happened before your handler started running.
234
259
 
260
+ Visibility deadlines use Redis server time (`TIME`), not Python process time.
261
+ A forward step in the Redis server clock can make a live lease appear expired
262
+ and allow premature redelivery while the original consumer is still processing;
263
+ a backward step can delay reclaim of truly abandoned messages. Treat NTP step
264
+ corrections on Redis hosts as a deployment risk. Prefer time-synchronization
265
+ discipline that slews corrections rather than stepping the Redis clock.
266
+
235
267
  ### Ordering and multi-consumer fairness
236
268
 
237
269
  The built-in queue is a shared-pull Redis list. Successful publishes push to the
@@ -328,6 +360,11 @@ while not interrupt.is_interrupted():
328
360
  > `ValueError`. A repeated owned signal falls back to the default behavior
329
361
  > (for example, a second Ctrl+C raises `KeyboardInterrupt`). If you need multiple
330
362
  > shutdown hooks, use a single handler and fan out in your own code.
363
+ >
364
+ > Process-global signal ownership cannot be safely chained with task-worker
365
+ > CLIs such as Celery, RQ, or Dramatiq. Run sibling workers in separate
366
+ > processes, or install one top-level signal owner that calls `queue.drain()`
367
+ > / `queue.aclose()` or sets an application stop event.
331
368
 
332
369
  There are three distinct shutdown shapes; pick the one that matches your runtime:
333
370
 
@@ -357,7 +394,10 @@ if a publish is already inside the queue instance's publish path, drain waits
357
394
  for that publish to finish before it returns; publishes that arrive after the
358
395
  drained flag is set are rejected. The drained state is local to that Python
359
396
  queue object and is not written to Redis, so constructing a fresh
360
- `RedisMessageQueue(...)` over the same keys remains usable.
397
+ `RedisMessageQueue(...)` over the same keys remains usable. A separate process
398
+ or separate queue instance against the same Redis keys is not marked drained by
399
+ this call. For multi-process graceful shutdown, each process must drain its own
400
+ queue instances.
361
401
 
362
402
  Drain does not cancel in-flight handlers — the caller must arrange handler
363
403
  exit through normal thread/task coordination. Returns `True` if all in-memory
@@ -365,6 +405,24 @@ pending claim IDs were recovered within the timeout; `False` if the deadline
365
405
  fired or transient Redis errors left claim IDs pending (call again to retry).
366
406
  `timeout=0` reports current state without attempting recovery.
367
407
 
408
+ #### Abandoned in-flight messages
409
+
410
+ Abandoned in-flight messages are recovered lazily. Async tasks cancelled
411
+ without `aclose()`, or sync processes killed mid-handler, can leave the message
412
+ and its processing/lease metadata in Redis until a later consumer claim path
413
+ triggers visibility-timeout reclaim. With visibility timeouts enabled, this is
414
+ the designed at-least-once recovery path: the message is delayed by the lease,
415
+ not lost. With `visibility_timeout_seconds=None`, there is no automatic reclaim
416
+ path. For low-visibility-timeout workloads, prefer an explicit `drain()` /
417
+ `aclose()` during shutdown so local pending claim IDs are recovered before
418
+ process exit.
419
+
420
+ `drain()` / `aclose()` timeouts are measured with Python monotonic clocks, but
421
+ any lease deadlines they recover were created from Redis server time. The same
422
+ Redis-clock step caveats from
423
+ [Crash recovery with visibility timeout](#crash-recovery-with-visibility-timeout)
424
+ apply to when abandoned work becomes reclaimable.
425
+
368
426
  > **Heartbeat caveat (best-effort stop):** when `heartbeat_interval_seconds` is
369
427
  > set, the heartbeat sidecar's `stop()` is bounded but not strictly quiescent —
370
428
  > a slow renewal in flight when `process_message` exits may still write to
@@ -469,6 +527,42 @@ await client.aclose()
469
527
  For the sync Redis client, call `client.close()` during application shutdown when
470
528
  you own the client lifecycle.
471
529
 
530
+ ## Migrating from RQ / Celery / Dramatiq / taskiq
531
+
532
+ redis-message-queue is a payload queue, not a task framework. It has no task
533
+ registry, job object, result backend, scheduler, workflow canvas, callback
534
+ graph, or handler-level retry policy. Producers publish a `str` or `dict`
535
+ payload, and consumers decide what that payload means.
536
+
537
+ The most important semantic differences from sibling task libraries are:
538
+
539
+ - Handler exceptions are terminal. Raising inside `process_message()` removes
540
+ the message from `processing`, or moves it to the failed list when
541
+ `enable_failed_queue=True`; it does not requeue or retry the message.
542
+ - `visibility_timeout_seconds` is a crash/stall recovery lease, not a runtime
543
+ limit. Slow handlers are not interrupted; after the lease expires another
544
+ consumer can process the same payload concurrently.
545
+ - `on_event` is telemetry only. Callback exceptions are logged and emitted as
546
+ `RuntimeWarning`, but they do not affect ack/nack, failed-queue movement, or
547
+ any other message outcome. Do not use `on_event` for sagas, follow-up writes,
548
+ billing callbacks, or other correctness-critical work.
549
+ - Dict payloads are JSON data, not Python call arguments. JSON does not
550
+ preserve every Python type: tuples become lists, and sets or custom objects
551
+ raise unless you encode them into JSON-native values first.
552
+ - Process-global signal ownership cannot be safely chained with Celery, RQ, or
553
+ Dramatiq CLI workers. Prefer one top-level owner that calls `queue.drain()`
554
+ or sets an application stop event, and run sibling workers in separate
555
+ processes.
556
+
557
+ When migrating on the same Redis deployment, prefer separate Redis DBs or hard
558
+ namespaces. Do not point a Celery, RQ, Dramatiq, or taskiq worker at an rmq
559
+ pending key. A sibling worker can pop the rmq stored message, fail its own
560
+ decoder, and leave the rmq queue without that message. Also avoid custom
561
+ `key_separator` values that synthesize another library's key namespace, such as
562
+ using `":queue:"` with a queue name that overlaps RQ keys. rmq has no fixed
563
+ library prefix; generated keys share the Redis DB namespace with every other
564
+ Redis user.
565
+
472
566
  ## Production notes
473
567
 
474
568
  ### Fork safety and pre-fork servers
@@ -584,8 +678,10 @@ Events cover publish, dedup hits, claim/empty polls, reclaim, ack/nack,
584
678
  completed/failed cleanup, DLQ moves, heartbeat renewal, stale leases, cleanup
585
679
  and trim failures, and retry attempts. Callback exceptions are logged and
586
680
  reported with `RuntimeWarning`, but never propagate into queue operations.
587
- Package logs remain diagnostic; use `on_event` rather than log parsing for
588
- metrics.
681
+ `on_event` is telemetry only: use it for metrics, tracing, and logging, not for
682
+ sagas, follow-up writes, billing callbacks, or other correctness-critical
683
+ work. Package logs remain diagnostic; use `on_event` rather than log parsing
684
+ for metrics.
589
685
 
590
686
  ```python
591
687
  from opentelemetry import trace
@@ -1,6 +1,6 @@
1
1
  [tool.poetry]
2
2
  name = "redis-message-queue"
3
- version = "8.0.2"
3
+ version = "8.0.3"
4
4
  description = "Python message queuing with Redis and message deduplication"
5
5
  authors = ["Elijas <4084885+Elijas@users.noreply.github.com>"]
6
6
  readme = "README.md"
@@ -50,6 +50,9 @@ def is_redis_retryable_exception(exception):
50
50
  ),
51
51
  )
52
52
 
53
+ if isinstance(exception, redis.exceptions.ClusterError) and "TTL exhausted" in str(exception):
54
+ return True
55
+
53
56
  # 2. Explicit retryable exceptions (BusyLoadingError is a ConnectionError
54
57
  # subclass, so it is already handled by branch 1 above)
55
58
  return isinstance(
@@ -17,6 +17,14 @@ def validate_callable_deduplication_key(dedup_key: object, message: str | dict)
17
17
 
18
18
 
19
19
  class QueueKeyManager:
20
+ """Build Redis keys for one rmq queue namespace.
21
+
22
+ ``key_separator`` is part of every generated key and rmq has no fixed
23
+ library prefix. Do not choose a separator that overlaps another Redis task
24
+ library's namespace, such as ``":queue:"`` with RQ-style keys; user-chosen
25
+ separators interact with every Redis user on the same DB.
26
+ """
27
+
20
28
  # Logs message existence to prevent duplication.
21
29
  # Messages are marked for the duration of their lifecycle.
22
30
  _MESSAGE_DEDUPLICATION_LOG = "deduplication"
@@ -1,4 +1,5 @@
1
1
  import re
2
+ from collections.abc import Mapping
2
3
 
3
4
  from redis.crc import key_slot
4
5
 
@@ -6,12 +7,36 @@ from redis_message_queue._exceptions import ConfigurationError
6
7
  from redis_message_queue._queue_key_manager import QueueKeyManager
7
8
 
8
9
  _HASH_TAG_PATTERN = re.compile(r"\{([^{}]+)\}")
10
+ PLAIN_REDIS_CLUSTER_CLIENT_MESSAGE = (
11
+     "The provided Redis client is a plain {client_type} connected to a Redis Cluster node "
12
+     "('INFO cluster' reports cluster_enabled=1). Use redis.RedisCluster or "
13
+     "redis.asyncio.RedisCluster instead, and use a hash-tagged queue name such as '{{myqueue}}' "
14
+     "so all queue keys share one Redis Cluster slot."
15
+ )
9
16
 
10
17
 
11
18
  def _redis_cluster_key_slot(key: str) -> int:
12
19
  return key_slot(key.encode("utf-8"))
13
20
 
14
21
 
22
+ def redis_info_reports_cluster_enabled(info: object) -> bool:
23
+     if not isinstance(info, Mapping):
24
+         return False
25
+
26
+     value = info.get("cluster_enabled")
27
+     if value is None:
28
+         value = info.get(b"cluster_enabled")
29
+     if isinstance(value, bytes):
30
+         value = value.decode("utf-8", errors="replace")
31
+     if isinstance(value, str):
32
+         return value.strip() == "1"
33
+     return value == 1
34
+
35
+
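The helper added above depends only on `Mapping`, so the input shapes it accepts can be checked directly. The function is reproduced from this hunk so the block runs standalone; the sample `INFO` mappings are made up for illustration and are not captured from a real server.

```python
from collections.abc import Mapping


def redis_info_reports_cluster_enabled(info: object) -> bool:
    if not isinstance(info, Mapping):
        return False

    value = info.get("cluster_enabled")
    if value is None:
        value = info.get(b"cluster_enabled")  # clients that do not decode responses
    if isinstance(value, bytes):
        value = value.decode("utf-8", errors="replace")
    if isinstance(value, str):
        return value.strip() == "1"
    return value == 1


samples = {
    "int flag": {"cluster_enabled": 1},
    "str flag": {"cluster_enabled": "1"},
    "bytes key and value": {b"cluster_enabled": b"1"},
    "disabled": {"cluster_enabled": 0},
    "missing field": {},
    "not a mapping": None,
}
results = {label: redis_info_reports_cluster_enabled(info) for label, info in samples.items()}
print(results)
```

Only the first three samples report cluster mode; a missing field or a non-mapping reply is treated as "not a cluster".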
36
+ def plain_redis_cluster_client_error(client_type: str) -> ConfigurationError:
37
+     return ConfigurationError(PLAIN_REDIS_CLUSTER_CLIENT_MESSAGE.format(client_type=client_type))
38
+
39
+
15
40
  def validate_queue_keys_for_redis_cluster(
16
41
  key_manager: QueueKeyManager,
17
42
  *,
@@ -27,7 +27,11 @@ from redis_message_queue._exceptions import (
27
27
  QueueDrainedError,
28
28
  )
29
29
  from redis_message_queue._queue_key_manager import QueueKeyManager, validate_callable_deduplication_key
30
- from redis_message_queue._redis_cluster import validate_queue_keys_for_redis_cluster
30
+ from redis_message_queue._redis_cluster import (
31
+ plain_redis_cluster_client_error,
32
+ redis_info_reports_cluster_enabled,
33
+ validate_queue_keys_for_redis_cluster,
34
+ )
31
35
  from redis_message_queue._stored_message import (
32
36
  ClaimedMessage,
33
37
  MessageData,
@@ -242,16 +246,32 @@ def _validate_cluster_configuration(
242
246
  client: redis.asyncio.Redis | None = None,
243
247
  gateway: AbstractRedisGateway | None = None,
244
248
  dead_letter_queue: str | None = None,
245
- ) -> None:
246
- if client is not None and isinstance(client, redis.asyncio.RedisCluster):
247
- validate_queue_keys_for_redis_cluster(key_manager, dead_letter_queue=dead_letter_queue)
248
- return
249
+ ) -> bool:
250
+ if client is not None:
251
+ if isinstance(client, redis.asyncio.RedisCluster):
252
+ validate_queue_keys_for_redis_cluster(key_manager, dead_letter_queue=dead_letter_queue)
253
+ return False
254
+ return type(client) is redis.asyncio.Redis
249
255
  if gateway is None or not gateway.is_redis_cluster:
250
- return
256
+ return False
251
257
  validate_queue_keys_for_redis_cluster(
252
258
  key_manager,
253
259
  dead_letter_queue=gateway.dead_letter_queue,
254
260
  )
261
+ return False
262
+
263
+
264
+ async def _plain_redis_client_reports_cluster(client: redis.asyncio.Redis) -> bool:
265
+     try:
266
+         info = await client.info("cluster")
267
+     except redis.exceptions.RedisError as exc:
268
+         logger.warning(
269
+             "Could not verify whether plain Redis client is connected to a Redis Cluster node; "
270
+             "trusting the provided client: %s",
271
+             exc,
272
+         )
273
+         return False
274
+     return redis_info_reports_cluster_enabled(info)
255
275
 
256
276
 
257
277
  def _derive_dead_letter_queue(name: str, key_separator: str) -> str:
@@ -508,6 +528,12 @@ class RedisMessageQueue:
508
528
  disable lease-based crash recovery; messages left in ``processing`` by a
509
529
  crashed worker are then not reclaimed automatically.
510
530
 
531
+ ``visibility_timeout_seconds`` is a Redis server-time lease, not a
532
+ handler runtime limit. Long-running handlers are not interrupted; if the
533
+ lease expires, another consumer can reclaim and process the same message
534
+ concurrently. A forward step in the Redis server clock can make a live
535
+ lease appear expired before that much real processing time has elapsed.
536
+
511
537
  ``max_delivery_count`` defaults to 10 on the built-in ``client=`` path.
512
538
  Messages reclaimed more than this many times are routed to the
513
539
  auto-derived dead-letter queue. Set it to ``None`` for unlimited
@@ -531,13 +557,19 @@ class RedisMessageQueue:
531
557
  waits for capacity before raising ``QueueBackpressureError``. ``0``
532
558
  performs a single immediate capacity check.
533
559
 
560
+ ``key_separator`` only controls generated Redis key names; rmq has no
561
+ fixed library prefix. Do not customize it to overlap another Redis
562
+ task library's namespace, such as ``":queue:"`` with RQ-style keys.
563
+
534
564
  ``interrupt`` accepts a ``BaseGracefulInterruptHandler``; pass
535
565
  ``GracefulInterruptHandler()`` for prompt Ctrl-C / termination handling
536
566
  in polling waits. ``on_heartbeat_failure`` is a zero-argument callable
537
567
  or coroutine callable invoked when lease renewal fails. ``on_event`` is
538
- an async callback receiving best-effort QueueEvent lifecycle
539
- notifications; callback failures are logged and converted to
540
- RuntimeWarning without interrupting queue operations.
568
+ telemetry only: an async callback receiving best-effort QueueEvent
569
+ lifecycle notifications. Callback failures are logged and converted to
570
+ RuntimeWarning without influencing ack/nack or any other message
571
+ outcome. Do not use it for correctness-critical callbacks or follow-up
572
+ writes.
541
573
  """
542
574
  self.key = QueueKeyManager(name, key_separator=key_separator)
543
575
  if not isinstance(deduplication, bool):
@@ -639,6 +671,7 @@ class RedisMessageQueue:
639
671
  self._drained = False
640
672
  self._publish_lock = asyncio.Lock()
641
673
  self._aclose_lock = asyncio.Lock()
674
+ self._cluster_validation_lock = asyncio.Lock()
642
675
  self._aclose_result: bool | None = None
643
676
  self._deduplication = deduplication
644
677
  self._enable_completed_queue = enable_completed_queue
@@ -650,6 +683,7 @@ class RedisMessageQueue:
650
683
  self._heartbeat_interval_seconds = None
651
684
  self._warned_no_lease_for_heartbeat = False
652
685
  self._requires_claimed_message = False
686
+ self._plain_redis_cluster_probe_client: redis.asyncio.Redis | None = None
653
687
 
654
688
  if gateway is not None:
655
689
  visibility_timeout_was_configured = visibility_timeout_seconds not in (
@@ -712,7 +746,8 @@ class RedisMessageQueue:
  dead_letter_queue = (
  _derive_dead_letter_queue(name, key_separator) if max_delivery_count is not None else None
  )
- _validate_cluster_configuration(self.key, client=client, dead_letter_queue=dead_letter_queue)
+ if _validate_cluster_configuration(self.key, client=client, dead_letter_queue=dead_letter_queue):
+ self._plain_redis_cluster_probe_client = client
  self._heartbeat_interval_seconds = _validate_heartbeat_interval_seconds(
  heartbeat_interval_seconds,
  visibility_timeout_seconds,
@@ -791,6 +826,18 @@ class RedisMessageQueue:
  stacklevel=2,
  )

+ async def _ensure_plain_redis_client_is_not_cluster(self) -> None:
+ client = self._plain_redis_cluster_probe_client
+ if client is None:
+ return
+ async with self._cluster_validation_lock:
+ client = self._plain_redis_cluster_probe_client
+ if client is None:
+ return
+ if await _plain_redis_client_reports_cluster(client):
+ raise plain_redis_cluster_client_error(type(client).__name__)
+ self._plain_redis_cluster_probe_client = None
+
  async def publish(self, message: str | dict) -> bool:
  """Publish a message.
 
@@ -799,6 +846,17 @@ class RedisMessageQueue:
  ``TypeError`` to avoid silent ``json.dumps`` coercion that would
  collapse distinct keys into the same dedup key (e.g. ``{1: "x"}``
  vs ``{"1": "x"}``).
+
+ Dict payloads are JSON-encoded data, not Python object serialization.
+ JSON does not preserve every Python type: tuples become lists, raw set
+ values raise unless converted to lists before publish, and custom
+ objects raise. Plan dict payload schemas in JSON-native types only.
+
+ Deduplication and publish retry-safety markers are Redis TTL keys. A
+ large forward step in Redis server expiration time during a retry
+ window can expire those markers before the Python-side monotonic retry
+ budget elapses, allowing a duplicate publish under that extreme
+ anomaly.
  """
  async with self._publish_lock:
  if self._drained:
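
The JSON caveats in the docstring above can be reproduced with the standard library alone. The following is a minimal sketch using plain `json.dumps` / `json.loads`; `json_round_trip` is a hypothetical helper for illustration, not part of redis-message-queue, and the library's own encoding options are not shown in this diff.

```python
import json


def json_round_trip(payload: dict) -> dict:
    """Encode then decode a payload, as any JSON-based transport would."""
    return json.loads(json.dumps(payload))


# Tuples are encoded as JSON arrays, so they come back as lists.
print(json_round_trip({"point": (1, 2)}))

# Plain json.dumps coerces non-string keys to strings, so two distinct
# dicts can serialize to the same text (the collision the docstring names).
print(json.dumps({1: "x"}) == json.dumps({"1": "x"}))

# Sets are rejected by the encoder with TypeError.
try:
    json.dumps({"tags": {"a", "b"}})
except TypeError as exc:
    print(f"rejected: {exc}")

# Converting to a sorted list first yields a stable, JSON-native value.
print(json.dumps({"tags": sorted({"a", "b"})}))
```

Sorting the set before encoding also keeps the serialized text deterministic, which matters if the serialized payload feeds a deduplication key.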
@@ -808,6 +866,7 @@ class RedisMessageQueue:
  async def _publish_unlocked(self, message: str | dict) -> bool:
  started_at = time.perf_counter()
  try:
+ await self._ensure_plain_redis_client_is_not_cluster()
  if not isinstance(message, (str, dict)):
  raise TypeError(f"'message' must be a str or dict, got {type(message).__name__}")
  if isinstance(message, dict):
@@ -866,6 +925,20 @@ class RedisMessageQueue:
 
  Yields ``str`` if your client uses ``decode_responses=True``, else
  ``bytes``. Match the client setting to the type your handler expects.
+
+ Important: exceptions raised inside the ``async with`` block are
+ terminal. rmq is a payload queue, not a task framework; handler
+ exceptions do not requeue the message. With
+ ``enable_failed_queue=False``, the message is removed from
+ ``processing``; with ``enable_failed_queue=True``, it is moved to the
+ failed list.
+
+ If the task is cancelled after a message is claimed and cleanup cannot
+ run, the claimed message and lease metadata remain in Redis until a
+ later consumer claim triggers visibility-timeout reclaim. With
+ visibility timeouts enabled, this is at-least-once recovery semantics:
+ the message is delayed by the lease, not lost. Use ``aclose()`` for an
+ explicit async drain path during shutdown.
  """
  claim_started_at = time.perf_counter()
  if self._draining:
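
Because an exception that escapes the ``async with`` block is terminal, any retry of a transient failure has to happen inside the block. The helper below is one possible sketch, not a library feature: `retry_transient`, its parameters, and the choice of `ConnectionError` / `TimeoutError` as the transient set are all assumptions for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_transient(
    operation: Callable[[], Awaitable[T]],
    *,
    attempts: int = 3,
    delay_seconds: float = 0.1,
    transient: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Await operation(), retrying only on exception types listed as transient."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except transient:
            if attempt == attempts:
                # Out of retries: let the error escape. Inside a
                # process_message block that ends the message's processing.
                raise
            # Linear backoff between attempts.
            await asyncio.sleep(delay_seconds * attempt)
    raise AssertionError("unreachable")


# Intended use, with a hypothetical handle() coroutine (not executed here):
#
#     async with queue.process_message() as message:
#         if message is not None:
#             await retry_transient(lambda: handle(message))
```

Retries lengthen the time a message stays claimed, so the total retry budget should stay well inside the configured visibility timeout.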
@@ -873,6 +946,7 @@ class RedisMessageQueue:
  yield None
  return
  try:
+ await self._ensure_plain_redis_client_is_not_cluster()
  claimed_message = await self._wait_for_message_and_move()
  if claimed_message is not None:
  if not isinstance(claimed_message, (ClaimedMessage, str, bytes)):
@@ -1157,6 +1231,16 @@ class RedisMessageQueue:
  but no further claims are taken. Callers must await any
  in-flight ``process_message`` tasks separately — ``aclose()`` does
  not cancel them.
+
+ ``timeout`` is measured with the event loop's monotonic clock, but
+ visibility leases being recovered are anchored to Redis server
+ ``TIME``. A forward step in the Redis server clock can make leases
+ eligible for reclaim earlier than real elapsed handler time.
+
+ ``aclose()`` is queue-instance and process-local. A separate process,
+ or a separate ``RedisMessageQueue`` instance using the same Redis keys,
+ is not marked drained by this call. For multi-process graceful
+ shutdown, each process must drain its own queue instances.
  """
  if timeout is not None and (not isinstance(timeout, (int, float)) or isinstance(timeout, bool)):
  raise TypeError(f"'timeout' must be a number or None, got {type(timeout).__name__}")
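
The clock caveat in the ``aclose()`` docstring can be illustrated with a toy model. This is only a sketch under stated assumptions: it treats a lease as a deadline in Redis server milliseconds, and both function names are hypothetical. The library's actual lease bookkeeping is not shown in this diff.

```python
def lease_deadline_ms(claimed_at_ms: int, visibility_timeout_seconds: float) -> int:
    """Deadline of a lease, expressed in (modelled) Redis server milliseconds."""
    return claimed_at_ms + int(visibility_timeout_seconds * 1000)


def is_reclaimable(deadline_ms: int, redis_now_ms: int) -> bool:
    """A lease becomes reclaimable once server time reaches its deadline."""
    return redis_now_ms >= deadline_ms


claimed_at = 1_000_000
deadline = lease_deadline_ms(claimed_at, visibility_timeout_seconds=30)

# Five real seconds later on a steady server clock: the lease is still live.
print(is_reclaimable(deadline, claimed_at + 5_000))

# The same five real seconds, but the server clock stepped forward by 60 s:
# the lease now looks expired although the handler has run for only 5 s.
print(is_reclaimable(deadline, claimed_at + 5_000 + 60_000))
```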
@@ -1192,7 +1276,11 @@ class RedisMessageQueue:
  return result

  async def drain(self, timeout: float | None = None) -> bool:
- """Alias of :meth:`aclose` for explicit async drain naming."""
+ """Alias of :meth:`aclose` for explicit async drain naming.
+
+ See :meth:`aclose` for process-local drain and Redis server-time lease
+ caveats.
+ """
  return await self.aclose(timeout)

  def __repr__(self) -> str:
@@ -37,6 +37,12 @@ class GracefulInterruptHandler(BaseGracefulInterruptHandler):
  repeated signal for an owned handler falls back to the previous/default
  disposition so operators can still force termination (for example, a second
  Ctrl+C raises ``KeyboardInterrupt``).
+
+ Process-global signal ownership cannot be safely chained. If rmq runs in
+ the same process as Celery, RQ, or Dramatiq CLI workers, the libraries may
+ overwrite each other's SIGTERM/SIGINT handlers. Prefer one top-level signal
+ owner that calls ``queue.drain()`` or sets an application stop event, and
+ run sibling workers in separate processes.
  """

  _DEFAULT_SIGNALS = (
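
The "one top-level signal owner" advice in the docstring above could look roughly like the sketch below. `Drainable`, `install_stop_signals`, and `drain_on_stop` are hypothetical names. The only library surface assumed is an async `drain(timeout)` method returning `bool`, as declared in the async queue earlier in this diff. `loop.add_signal_handler` is available on Unix event loops only.

```python
import asyncio
import signal
from typing import Iterable, List, Optional, Protocol


class Drainable(Protocol):
    """Anything with an async drain(timeout) -> bool method."""

    async def drain(self, timeout: Optional[float] = None) -> bool: ...


def install_stop_signals(
    stop: asyncio.Event,
    signals: Iterable[int] = (signal.SIGINT, signal.SIGTERM),
) -> None:
    """Make the application the single signal owner: a signal only sets the event."""
    loop = asyncio.get_running_loop()
    for signum in signals:
        loop.add_signal_handler(signum, stop.set)  # Unix-only API


async def drain_on_stop(
    stop: asyncio.Event,
    queues: Iterable[Drainable],
    timeout: Optional[float] = 30.0,
) -> List[bool]:
    """Wait for the stop event, then drain each queue instance in turn."""
    await stop.wait()
    return [await queue.drain(timeout) for queue in queues]
```

Draining covers only the queue instances passed in; other processes sharing the same Redis keys need their own stop path, as the ``aclose()`` docstring notes.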
@@ -28,7 +28,11 @@ from redis_message_queue._exceptions import (
  QueueDrainedError,
  )
  from redis_message_queue._queue_key_manager import QueueKeyManager, validate_callable_deduplication_key
- from redis_message_queue._redis_cluster import validate_queue_keys_for_redis_cluster
+ from redis_message_queue._redis_cluster import (
+ plain_redis_cluster_client_error,
+ redis_info_reports_cluster_enabled,
+ validate_queue_keys_for_redis_cluster,
+ )
  from redis_message_queue._redis_gateway import RedisGateway
  from redis_message_queue._stored_message import (
  ClaimedMessage,
@@ -177,8 +181,11 @@ def _validate_cluster_configuration(
  gateway: AbstractRedisGateway | None = None,
  dead_letter_queue: str | None = None,
  ) -> None:
- if client is not None and isinstance(client, redis.RedisCluster):
- validate_queue_keys_for_redis_cluster(key_manager, dead_letter_queue=dead_letter_queue)
+ if client is not None:
+ if isinstance(client, redis.RedisCluster):
+ validate_queue_keys_for_redis_cluster(key_manager, dead_letter_queue=dead_letter_queue)
+ return
+ _validate_plain_redis_client_not_cluster(client)
  return
  if gateway is None or not gateway.is_redis_cluster:
  return
@@ -188,6 +195,22 @@ def _validate_cluster_configuration(
  )


+ def _validate_plain_redis_client_not_cluster(client: redis.Redis) -> None:
+ if type(client) is not redis.Redis:
+ return
+ try:
+ info = client.info("cluster")
+ except redis.exceptions.RedisError as exc:
+ logger.warning(
+ "Could not verify whether plain Redis client is connected to a Redis Cluster node; "
+ "trusting the provided client: %s",
+ exc,
+ )
+ return
+ if redis_info_reports_cluster_enabled(info):
+ raise plain_redis_cluster_client_error(type(client).__name__)
+
+
  def _derive_dead_letter_queue(name: str, key_separator: str) -> str:
  return f"{name}{key_separator}{_AUTO_DEAD_LETTER_QUEUE_SUFFIX}"
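
The body of `redis_info_reports_cluster_enabled` is not part of this hunk. As a rough sketch of what such a check might do, the hypothetical `info_reports_cluster` below inspects the `cluster_enabled` field that Redis reports in the `cluster` section of `INFO`, tolerating the integer, string, or bytes forms that different client settings could produce. The library's real implementation may differ.

```python
from typing import Any, Mapping


def info_reports_cluster(info: Mapping[str, Any]) -> bool:
    """Return True when an INFO mapping says cluster mode is enabled."""
    value = info.get("cluster_enabled", 0)
    if isinstance(value, bytes):
        value = value.decode()
    try:
        return int(value) == 1
    except (TypeError, ValueError):
        # Unparseable value: treat as "not reported" rather than failing.
        return False


print(info_reports_cluster({"cluster_enabled": 1}))
print(info_reports_cluster({"cluster_enabled": "0"}))
print(info_reports_cluster({}))
```

Treating a missing or unparseable field as "not cluster" mirrors the diff's choice to log a warning and trust the client when the probe itself fails.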
 
@@ -449,6 +472,12 @@ class RedisMessageQueue:
  disable lease-based crash recovery; messages left in ``processing`` by a
  crashed worker are then not reclaimed automatically.

+ ``visibility_timeout_seconds`` is a Redis server-time lease, not a
+ handler runtime limit. Long-running handlers are not interrupted; if the
+ lease expires, another consumer can reclaim and process the same message
+ concurrently. A forward step in the Redis server clock can make a live
+ lease appear expired before that much real processing time has elapsed.
+
  ``max_delivery_count`` defaults to 10 on the built-in ``client=`` path.
  Messages reclaimed more than this many times are routed to the
  auto-derived dead-letter queue. Set it to ``None`` for unlimited
@@ -472,12 +501,18 @@ class RedisMessageQueue:
  waits for capacity before raising ``QueueBackpressureError``. ``0``
  performs a single immediate capacity check.

+ ``key_separator`` only controls generated Redis key names; rmq has no
+ fixed library prefix. Do not customize it to overlap another Redis
+ task library's namespace, such as ``":queue:"`` with RQ-style keys.
+
  ``interrupt`` accepts a ``BaseGracefulInterruptHandler``; pass
  ``GracefulInterruptHandler()`` for prompt Ctrl-C / termination handling
  in polling waits. ``on_heartbeat_failure`` is a zero-argument callable
- invoked when lease renewal fails. ``on_event`` receives best-effort
- QueueEvent lifecycle notifications; callback failures are logged and
- converted to RuntimeWarning without interrupting queue operations.
+ invoked when lease renewal fails. ``on_event`` is telemetry only and
+ receives best-effort QueueEvent lifecycle notifications; callback
+ failures are logged and converted to RuntimeWarning without influencing
+ ack/nack or any other message outcome. Do not use it for
+ correctness-critical callbacks or follow-up writes.
  """
  self.key = QueueKeyManager(name, key_separator=key_separator)
  if not isinstance(deduplication, bool):
@@ -752,6 +787,17 @@ class RedisMessageQueue:
  ``TypeError`` to avoid silent ``json.dumps`` coercion that would
  collapse distinct keys into the same dedup key (e.g. ``{1: "x"}``
  vs ``{"1": "x"}``).
+
+ Dict payloads are JSON-encoded data, not Python object serialization.
+ JSON does not preserve every Python type: tuples become lists, raw set
+ values raise unless converted to lists before publish, and custom
+ objects raise. Plan dict payload schemas in JSON-native types only.
+
+ Deduplication and publish retry-safety markers are Redis TTL keys. A
+ large forward step in Redis server expiration time during a retry
+ window can expire those markers before the Python-side monotonic retry
+ budget elapses, allowing a duplicate publish under that extreme
+ anomaly.
  """
  with self._publish_lock:
  if self._drained.is_set():
@@ -829,6 +875,25 @@ class RedisMessageQueue:
 
  Yields ``str`` if your client uses ``decode_responses=True``, else
  ``bytes``. Match the client setting to the type your handler expects.
+
+ Important: exceptions raised inside the ``with`` block are terminal.
+ rmq is a payload queue, not a task framework; handler exceptions do not
+ requeue the message. With ``enable_failed_queue=False``, the message is
+ removed from ``processing``; with ``enable_failed_queue=True``, it is
+ moved to the failed list.
+
+ This sync context manager only observes whether the block raises. It
+ does not inspect handler return values; if your handler returns a
+ coroutine or other awaitable, the awaitable can be dropped while the
+ message is acked. Use ``redis_message_queue.asyncio.RedisMessageQueue``
+ for async handlers. An ergonomic callback API that detects this is
+ planned for v8.1.
+
+ If the process is killed mid-handler, the claimed message and lease
+ metadata remain in Redis until a later consumer claim triggers
+ visibility-timeout reclaim. With visibility timeouts enabled, this is
+ at-least-once recovery semantics: the message is delayed by the lease,
+ not lost.
  """
  claim_started_at = time.perf_counter()
  if self._draining:
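
Until the callback API mentioned in the docstring exists, a caller of the sync queue can guard against the dropped-awaitable case itself. The wrapper below is a hypothetical sketch, not library code: `run_sync_handler` calls the handler and raises if the result is awaitable, so the error escapes the ``with`` block instead of the message being acked silently.

```python
import inspect
from typing import Any, Callable


def run_sync_handler(handler: Callable[[Any], Any], message: Any) -> Any:
    """Call a handler and refuse awaitable results in a synchronous context."""
    result = handler(message)
    if inspect.isawaitable(result):
        if inspect.iscoroutine(result):
            # Close the coroutine to avoid a "never awaited" RuntimeWarning.
            result.close()
        name = getattr(handler, "__name__", repr(handler))
        raise TypeError(
            f"handler {name} returned an awaitable; use the asyncio queue "
            "for async handlers"
        )
    return result
```

Raising here is terminal for the message under the semantics described above, so the guard turns a silent loss of work into a visible failure rather than a retry.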
@@ -1108,10 +1173,20 @@ class RedisMessageQueue:
  ``None`` waits indefinitely, ``0`` skips the loop entirely. The
  flag is set regardless of the timeout value.

+ ``timeout`` is measured with Python monotonic time, but visibility
+ leases being recovered are anchored to Redis server ``TIME``. A forward
+ step in the Redis server clock can make leases eligible for reclaim
+ earlier than real elapsed handler time.
+
  Returns ``True`` if all pending claim ids were recovered (or none
  were present); ``False`` if recovery hit the deadline or a
  transient Redis error left claim ids pending.

+ Drain is queue-instance and process-local. A separate process, or a
+ separate ``RedisMessageQueue`` instance using the same Redis keys, is
+ not marked drained by this call. For multi-process graceful shutdown,
+ each process must drain its own queue instances.
+
  Drain does **not** cancel in-flight ``process_message`` handlers;
  the caller must coordinate handler exits via its own scheduling
  (joining threads / awaiting tasks). Heartbeat stop remains