deadpool-executor 2026.6.1__py3-none-any.whl

@@ -0,0 +1,842 @@
1
+ Metadata-Version: 2.4
2
+ Name: deadpool-executor
3
+ Version: 2026.6.1
4
+ Summary: Deadpool
5
+ Author-email: Caleb Hattingh <caleb.hattingh@gmail.com>
6
+ Requires-Python: >=3.10
7
+ Description-Content-Type: text/x-rst
8
+ Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
9
+ Classifier: License :: OSI Approved :: Apache Software License
10
+ Classifier: Development Status :: 3 - Alpha
11
+ Classifier: Intended Audience :: Developers
12
+ Classifier: Operating System :: POSIX :: Linux
13
+ Classifier: Programming Language :: Python :: 3.10
14
+ Classifier: Programming Language :: Python :: 3.11
15
+ Classifier: Programming Language :: Python :: 3.12
16
+ Classifier: Programming Language :: Python :: 3.13
17
+ Classifier: Programming Language :: Python :: 3.14
18
+ Classifier: Programming Language :: Python :: Implementation
19
+ Classifier: Programming Language :: Python :: Implementation :: CPython
20
+ License-File: LICENSE-AGPL
21
+ License-File: LICENSE-Apache
22
+ Requires-Dist: psutil
23
+ Requires-Dist: setproctitle
24
+ Requires-Dist: pytest >= 7 ; extra == "test"
25
+ Requires-Dist: pytest-cov ; extra == "test"
26
+ Requires-Dist: flake8 ; extra == "test"
27
+ Requires-Dist: coverage[toml] ; extra == "test"
28
+ Project-URL: Documentation, https://github.com/cjrh/deadpool
29
+ Project-URL: Home, https://github.com/cjrh/deadpool
30
+ Project-URL: Source, https://github.com/cjrh/deadpool
31
+ Provides-Extra: test
32
+
33
+ .. |ci| image:: https://github.com/cjrh/deadpool/workflows/Python%20application/badge.svg
34
+ :target: https://github.com/cjrh/deadpool/actions
35
+
36
+ .. |coverage| image:: https://coveralls.io/repos/github/cjrh/deadpool/badge.svg?branch=main
37
+ :target: https://coveralls.io/github/cjrh/deadpool?branch=main
38
+
39
+ .. |pyversions| image:: https://img.shields.io/pypi/pyversions/deadpool-executor.svg
40
+ :target: https://pypi.python.org/pypi/deadpool-executor
41
+
42
+ .. |tag| image:: https://img.shields.io/github/tag/cjrh/deadpool.svg
43
+ :target: https://img.shields.io/github/tag/cjrh/deadpool.svg
44
+
45
+ .. |install| image:: https://img.shields.io/badge/install-pip%20install%20deadpool--executor-ff69b4.svg
46
+ :target: https://img.shields.io/badge/install-pip%20install%20deadpool--executor-ff69b4.svg
47
+
48
+ .. |pypi| image:: https://img.shields.io/pypi/v/deadpool-executor.svg
49
+ :target: https://pypi.org/project/deadpool-executor/
50
+
51
+ .. |calver| image:: https://img.shields.io/badge/calver-YYYY.MM.MINOR-22bfda.svg
52
+ :alt: This project uses calendar-based versioning scheme
53
+ :target: http://calver.org/
54
+
55
+ .. |pepy| image:: https://pepy.tech/badge/deadpool-executor
56
+ :alt: Downloads
57
+ :target: https://pepy.tech/project/deadpool-executor
58
+
59
+ .. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg
60
+ :alt: This project uses the "black" style formatter for Python code
61
+ :target: https://github.com/python/black
62
+
63
+ .. |openssf| image:: https://api.securityscorecards.dev/projects/github.com/cjrh/deadpool/badge
64
+ :alt: OpenSSF Scorecard
65
+ :target: https://api.securityscorecards.dev/projects/github.com/cjrh/deadpool
66
+
67
+ |ci| |coverage| |pyversions| |tag| |install| |pypi| |calver| |pepy| |black| |openssf|
68
+
69
+ .. sectnum::
70
+
71
+ .. contents::
72
+ :local:
73
+ :depth: 2
74
+ :backlinks: entry
75
+
76
+ Deadpool
77
+ ========
78
+
79
+ ``Deadpool`` is a process pool that is really hard to kill.
80
+
81
+ ``Deadpool`` is an implementation of the ``Executor`` interface
82
+ in the ``concurrent.futures`` standard library. ``Deadpool`` is
83
+ a process pool executor, quite similar to the stdlib's
84
+ `ProcessPoolExecutor`_.
85
+
86
+ This document assumes that you are familiar with the stdlib
87
+ `ProcessPoolExecutor`_. If you are not, it is important
88
+ to understand that ``Deadpool`` makes very specific tradeoffs that
89
+ can result in quite different behaviour to the stdlib
90
+ implementation.
91
+
92
+ Licence
93
+ =======
94
+
95
+ This project can be licenced either under the terms of the `Apache 2.0`_
96
+ licence, or the `Affero GPL 3.0`_ licence. The choice is yours.
97
+
98
+ Installation
99
+ ============
100
+
101
+ The Python package name is *deadpool-executor*, so to install
102
+ you must type ``$ pip install deadpool-executor``. The import
103
+ name is *deadpool*, so in your Python code you must type
104
+ ``import deadpool`` to use it.
105
+
106
+ I try quite hard to keep dependencies to a minimum. Currently
107
+ ``Deadpool`` depends only on ``setproctitle`` and on ``psutil``, which
108
+ is simply too useful to avoid for this library.
109
+
110
+ Why would I want to use this?
111
+ =============================
112
+
113
+ I created ``Deadpool`` because I became frustrated with the
114
+ stdlib `ProcessPoolExecutor`_, and various other community
115
+ implementations of process pools. In particular, I had a use-case
116
+ that required a high server uptime, but also had variable and
117
+ unpredictable memory requirements such that certain tasks could
118
+ trigger the `OOM killer`_, often resulting in a "broken" process
119
+ pool. I also needed task-specific timeouts that could kill a "hung"
120
+ task, which the stdlib executor doesn't provide.
121
+
122
+ You might wonder, isn't it bad to just kill a task like that?
123
+ In my use-case, we had extensive logging and monitoring to alert
124
+ us if any tasks failed; but it was paramount that our services
125
+ continue to operate even when tasks got killed in OOM scenarios,
126
+ or specific tasks took too long. This is the primary trade-off
127
+ that ``Deadpool`` offers: the pool will not break, but tasks
128
+ can receive SIGKILL under certain conditions. This trade-off
129
+ is likely fine if you've seen many OOMs break your pools.
130
+
131
+ I also tried using the `Pebble <https://github.com/noxdafox/pebble>`_
132
+ community process pool. This is a cool project, featuring several
133
+ of the properties I've been looking for such as timeouts, and
134
+ more resilient operation. However, during testing I found several
135
+ occurrences of a mysterious `RuntimeError`_ that caused the Pebble
136
+ pool to become broken and no longer accept new tasks.
137
+
138
+ My goal with ``Deadpool`` is that **the pool must never enter
139
+ a broken state**. Any means by which that can happen will be
140
+ considered a bug.
141
+
142
+ What differs from `ProcessPoolExecutor`_?
143
+ =========================================
144
+
145
+ ``Deadpool`` is generally similar to `ProcessPoolExecutor`_ since it executes
146
+ tasks in subprocesses, and implements the standard ``Executor`` abstract
147
+ interface. We can draw a few comparisons to the stdlib pool to guide
148
+ your decision process about whether this makes sense for your use-case:
149
+
150
+ Similarities
151
+ ------------
152
+
153
+ - ``Deadpool`` also supports the
154
+ ``max_tasks_per_child`` parameter (a new feature in
155
+ Python 3.11, although it was available in `multiprocessing.Pool`_
156
+ since Python 3.2).
157
+ - The "initializer" callback in ``Deadpool`` works the same.
158
+ - ``Deadpool`` defaults to the `forkserver <https://docs.python.org/3.11/library/multiprocessing.html#contexts-and-start-methods>`_ multiprocessing
159
+ context, unlike the stdlib pool which defaults to ``fork`` on
160
+ Linux before Python 3.14. It's just a setting, though: you can change it in the same way as
161
+ with the stdlib pool. Like the stdlib, I strongly advise you to avoid
162
+ using ``fork`` because propagating threads and locks via fork is
163
+ going to ruin your day eventually. While this is a difference to the
164
+ default behaviour of the stdlib pool, it's not a difference in
165
+ behaviour to the stdlib pool when you use the ``forkserver`` context
166
+ which is the recommended context for multiprocessing.
167
+
168
+ Differences in existing behaviour
169
+ ---------------------------------
170
+
171
+ ``Deadpool`` differs from the stdlib pool in the following ways:
172
+
173
+ - If a ``Deadpool`` subprocess in the pool is killed by some
174
+ external actor, for example, the OS runs out of memory and the
175
+ `OOM killer`_ kills a pool subprocess that is using too much memory,
176
+ ``Deadpool`` does not care and further operation is unaffected.
177
+ ``Deadpool`` will not, and indeed cannot raise
178
+ `BrokenProcessPool <https://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#concurrent.futures.process.BrokenProcessPool>`_ or
179
+ `BrokenExecutor <https://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#concurrent.futures.BrokenExecutor>`_.
180
+ - ``Deadpool`` precreates all subprocesses up to the pool size on
181
+ creation.
182
+ - ``Deadpool`` tasks can have priorities. When the executor chooses
183
+ the next pending task to schedule to a subprocess, it chooses the
184
+ pending task with the highest priority. This gives you a way of
185
+ prioritizing certain kinds of tasks. For example, you might give
186
+ UI-sensitive tasks a higher priority to deliver a more snappy
187
+ user experience to your users. The priority can be specified in
188
+ the ``submit`` call.
189
+ - The shutdown parameters ``wait`` and ``cancel_futures`` can behave
190
+ differently to how they work in the `ProcessPoolExecutor`_. This is
191
+ discussed in more detail later in this document.
192
+ - ``Deadpool`` currently only works on Linux. There isn't any specific
193
+ reason it can't work on other platforms. The malloc trim feature also
194
+ requires a glibc system, so probably won't work on Alpine.
195
+
196
+ New features in Deadpool
197
+ ------------------------
198
+
199
+ ``Deadpool`` has the following features that are not present in the
200
+ stdlib pool:
201
+
202
+ - With ``Deadpool`` you can provide a "finalizer" callback that will
203
+ fire before a subprocess is shut down or killed. The finalizer callback
204
+ might be executed in a different thread than the main thread of the
205
+ subprocess, so don't rely on the callback running in the main
206
+ subprocess thread. There are certain circumstances where the finalizer
207
+ will not run at all, such as when the subprocess is killed by the OS
208
+ due to an out-of-memory (OOM) condition. So don't design your application
209
+ such that the finalizer is required to run for correct operation.
210
+ - Even though ``Deadpool`` typically uses a hard kill to remove
211
+ subprocesses, it does still run any handlers registered with
212
+ ``atexit``.
213
+ - ``Deadpool`` tasks can have timeouts. When a task hits the timeout,
214
+ the underlying subprocess in the pool is killed with ``SIGKILL``.
215
+ The entire process tree of that subprocess is killed. Your application
216
+ logic needs to handle this. The ``finalizer`` will not run.
217
+ - ``Deadpool`` also allows a ``finalizer``, with corresponding
218
+ ``finalargs``, that will be called after a task is executed on
219
+ a subprocess, but before the subprocess terminates. It is
220
+ analogous to the ``initializer`` and ``initargs`` parameters.
221
+ Just like the ``initializer`` callable, the ``finalizer``
222
+ callable is executed inside the subprocess. It is not guaranteed that
223
+ the finalizer will always run. If a process is killed, e.g. due to a
224
+ timeout or any other reason, the finalizer will not run. The finalizer
225
+ could be used for things like flushing pending monitoring messages,
226
+ such as traces and so on.
227
+ - ``Deadpool`` can ask the system allocator (Linux only) to return
228
+ unused memory to the OS once RSS exceeds a maximum threshold.
229
+ For long-running pools and modern
230
+ kernels, the system memory allocator can hold onto unused memory
231
+ for a surprisingly long time, and coupled with bloat due to
232
+ memory fragmentation, this can result in carrying very large
233
+ RSS values in your pool. The ``max_tasks_per_child`` helps with
234
+ this because a subprocess is entirely erased when the max is
235
+ reached, but it does mean that periodically there will be a small
236
+ latency penalty from constructing the replacement subprocess. In
237
+ my opinion, ``max_tasks_per_child`` is appropriate for when you
238
+ know or suspect there's a real memory leak somewhere in your code
239
+ (or a 3rd-party package!), and the easiest way to deal with that
240
+ right now is just to periodically remove a process.
241
+ - ``Deadpool`` can propagate ``os.environ`` to the subprocesses.
242
+ Normally, env vars present at the time of the "main" process will
243
+ propagate to subprocesses, but dynamically modified env vars
244
+ via ``os.environ`` will not. Actually, it depends on the start
245
+ method, with ``fork`` doing the propagation, and ``forkserver``
246
+ and ``spawn`` not doing it. The parameter ``propagate_environ``,
247
+ e.g., ``propagate_environ=os.environ``, re-enables this for
248
+ ``forkserver`` and ``spawn``. The supplied mapping will be
249
+ applied to the subprocesses as they are created. This also means
250
+ that if you want to modify some settings, you can modify the
251
+ mapping object at any time, and new subprocesses created after
252
+ that modification will get the new vars. One example use-case
253
+ is dynamically changing the logging level within subprocesses.
254
+
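The "applied when the subprocess is created" behaviour of ``propagate_environ`` can be pictured with a small sketch. It is only an illustration under assumptions: ``env_for_new_worker`` is a hypothetical helper that does not exist in Deadpool, and real workers are separate processes rather than dictionaries.

```python
def env_for_new_worker(inherited, propagate_environ):
    """Hypothetical helper: build the environment a new worker would see."""
    env = dict(inherited)          # what the start method hands over
    env.update(propagate_environ)  # the mapping is read at worker creation
    return env


inherited = {"LOG_LEVEL": "INFO", "HOME": "/home/app"}
shared = {"LOG_LEVEL": "INFO"}     # the mapping given as propagate_environ

first = env_for_new_worker(inherited, shared)
shared["LOG_LEVEL"] = "DEBUG"      # modified while the pool is running
second = env_for_new_worker(inherited, shared)

print(first["LOG_LEVEL"], second["LOG_LEVEL"])  # INFO DEBUG
```

Workers that already exist keep the values they started with; only workers created after the change see the new value.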
255
+ Minimum and Maximum Workers
256
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~
257
+
258
+ ``Deadpool`` has ``min_workers`` and ``max_workers`` parameters.
259
+ While ``max_workers`` is the same as the stdlib pool, ``min_workers``
260
+ is a new feature.
261
+
262
+ The ``min_workers`` parameter allows deadpool to "scale down" the
263
+ pool when it is idle. This is another strategy alongside other
264
+ features like ``max_tasks_per_child`` and ``max_worker_memory_bytes``
265
+ to help deal with memory bloat in long-running pools.
266
+
267
+ Statistics
268
+ ~~~~~~~~~~
269
+
270
+ Here is a very simple example of how to get statistics from the
271
+ executor:
272
+
273
+ .. code-block:: python
274
+
275
+     with deadpool.Deadpool() as exe:
+         fut = exe.submit(...)
+         stats = exe.get_statistics()
278
+
279
+ The call should be made while the executor is still alive. It
280
+ will succeed after the executor is shut down or closed, but
281
+ some of the statistics will be zeroed out.
282
+
283
+ The call to ``get_statistics`` will return a dictionary with the
284
+ following keys:
285
+
286
+ - ``tasks_received``: The total number of tasks submitted to the
287
+ executor. Does not mean that they started running, only that they
288
+ were successfully submitted.
289
+ - ``tasks_launched``: The total number of tasks that were launched
290
+ on a subprocess. This records the count of all tasks that were
291
+ successfully scheduled to run. These tasks were picked up from
292
+ the submit backlog and given to a worker process to execute.
293
+ - ``tasks_failed``: The total number of tasks that failed. This
294
+ includes tasks that raised an exception, and tasks that were
295
+ killed due to a timeout, and really any other reason that a task
296
+ failed.
297
+ - ``worker_processes_created``: The total number of subprocesses that
298
+ were ever created by the executor. This can be, and often will be
299
+ greater than the ``max_workers`` setting because there are many options
300
+ that can cause workers to be discarded and replaced. Examples of these
301
+ might be the ``max_tasks_per_child`` setting, or the ``min_workers``
302
+ setting, or the memory thresholds and so on.
303
+ - ``max_workers_busy_concurrently``: The maximum number of workers that
304
+ were ever busy at the same time. This is a useful statistic to
305
+ decide whether you might consider increasing or decreasing the size
306
+ of the pool. For example, if your ``max_workers`` is set to 100, but
307
+ after running for, say, a week, you see that ``max_workers_busy_concurrently``
308
+ is only 50, then you might consider reducing the pool size to 50.
309
+ The system memory manager on Linux likes to hold onto heap memory.
310
+ If you have more workers than you need, you'll see that the system
311
+ memory usage over time is going to be higher than it needs to be
312
+ because even when the pool is fully idle, you will still observe
313
+ the persistent worker processes having a large memory allocation
314
+ even though no jobs are running. This is a symptom of malloc
315
+ retention behaviour.
316
+ - ``worker_processes_still_alive``: The number of worker processes that
317
+ are still alive. This includes both idle and busy worker processes.
318
+ This is mainly a debugging statistic that I can use to check whether
319
+ worker processes are "leaking" somehow and not being cleaned up
320
+ correctly. This number should not be greater than the ``max_workers``.
321
+ (It could be, temporarily, depending on the exact timing and strategy
322
+ in the inner workings of the executor, but on average it should not.)
323
+ - ``worker_processes_idle``: The number of worker processes that are idle.
324
+ - ``worker_processes_busy``: The number of worker processes that are busy.
325
+
326
+
327
+ Here is an example from the tests to explain what each of the
328
+ statistics means:
329
+
330
+ .. code-block:: python
331
+
332
+     import time
+     import deadpool
+
+     # ``t`` and ``f_err`` are helpers from the test suite (not shown here).
+     # Judging by the assertions, ``t(x)`` returns ``x`` and ``f_err(exc)``
+     # raises ``exc``.
+     with deadpool.Deadpool(min_workers=5, max_workers=10) as exe:
+         futs = []
+         for _ in range(50):
+             futs.append(exe.submit(t, 0.05))
+             futs.append(exe.submit(f_err, Exception))
+
+         results = []
+         for fut in deadpool.as_completed(futs):
+             try:
+                 results.append(fut.result())
+             except Exception:
+                 pass
+
+         # Give the executor a moment to scale back down to min_workers
+         time.sleep(0.5)
+         stats = exe.get_statistics()
+
+     assert results == [0.05] * 50
+     print(f"{stats=}")
+     assert stats == {
+         "tasks_received": 100,
+         "tasks_launched": 100,
+         "tasks_failed": 50,
+         "worker_processes_created": 10,
+         "max_workers_busy_concurrently": 10,
+         "worker_processes_still_alive": 5,
+         "worker_processes_idle": 5,
+         "worker_processes_busy": 0,
+     }
360
+
361
+ In this example, we submit 100 tasks, 50 of which will raise an
362
+ exception. The executor will create 10 worker processes, and
363
+ will have a maximum of 10 workers busy at the same time. After
364
+ all the tasks are completed, we wait for a short time to allow
365
+ the executor to clean up any worker processes that are no longer
366
+ needed. The statistics should show that 5 worker processes are
367
+ still alive, and all of them are idle.
368
+
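As a rough illustration of how these numbers might be read, the sketch below derives a few indicators from a dictionary shaped like the one above. The ``summarize`` function and its output keys are assumptions made for this example; they are not part of Deadpool.

```python
def summarize(stats, max_workers):
    """Rough indicators derived from a get_statistics()-style dict."""
    launched = stats["tasks_launched"]
    return {
        # Share of launched tasks that ended in any kind of failure
        "failure_rate": stats["tasks_failed"] / launched if launched else 0.0,
        # Tasks accepted by submit() but never handed to a worker
        "not_yet_launched": stats["tasks_received"] - launched,
        # 1.0 means every worker was busy at the same time at least once
        "peak_utilisation": stats["max_workers_busy_concurrently"] / max_workers,
    }


stats = {
    "tasks_received": 100,
    "tasks_launched": 100,
    "tasks_failed": 50,
    "worker_processes_created": 10,
    "max_workers_busy_concurrently": 10,
    "worker_processes_still_alive": 5,
    "worker_processes_idle": 5,
    "worker_processes_busy": 0,
}
summary = summarize(stats, max_workers=10)
print(summary)
```

A ``peak_utilisation`` that stays well below 1.0 over a long run suggests the pool could be made smaller.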
369
+ Show me some code
370
+ =================
371
+
372
+ Simple case
373
+ -----------
374
+
375
+ The simple case works exactly the same as with `ProcessPoolExecutor`_:
376
+
377
+ .. code-block:: python
378
+
379
+     import deadpool
+
+     def f():
+         return 123
+
+     with deadpool.Deadpool() as exe:
+         fut = exe.submit(f)
+         result = fut.result()
+
+     assert result == 123
389
+
390
+ It is intended that all the basic behaviour should "just work" in the
391
+ same way, and ``Deadpool`` should be a drop-in replacement for
392
+ `ProcessPoolExecutor`_; but there are some subtle differences so you
393
+ should read all of this document to see if any of those will affect you.
394
+
395
+ Timeouts
396
+ --------
397
+
398
+ If a timeout is reached on a task, the subprocess running that task will be
399
+ killed, as in ``SIGKILL``. ``Deadpool`` doesn't mind, but your own
400
+ application should: if you use timeouts it is likely important that your tasks
401
+ be `idempotent <https://en.wikipedia.org/wiki/Idempotence>`_, especially if
402
+ your application will restart tasks, or restart them after application deployment,
403
+ and other similar scenarios.
404
+
405
+ .. code-block:: python
406
+
407
+     import time
+     import deadpool
+     import pytest
+
+     def f():
+         time.sleep(10.0)
+
+     with deadpool.Deadpool() as exe:
+         fut = exe.submit(f, deadpool_timeout=1.0)
+
+         # The worker is killed after 1 second, long before f() returns
+         with pytest.raises(deadpool.TimeoutError):
+             fut.result()
418
+
419
+ The parameter ``deadpool_timeout`` is special and consumed by ``Deadpool``
420
+ in the call. You can't use a parameter with this name in your function
421
+ kwargs.
422
+
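To see why a reserved keyword cannot reach your own function, consider the toy ``submit`` below. It is a hypothetical stand-in written for this explanation, not Deadpool's implementation, and it simply runs the function inline instead of sending it to a subprocess.

```python
def submit(fn, *args, deadpool_timeout=None, deadpool_priority=0, **kwargs):
    """Toy stand-in for an executor's submit(); runs fn inline."""
    # The deadpool_* keywords bind to this function's own parameters,
    # so only the remaining kwargs are forwarded to fn.
    return {
        "timeout": deadpool_timeout,
        "priority": deadpool_priority,
        "result": fn(*args, **kwargs),
    }


def work(x, scale=1):
    return x * scale


outcome = submit(work, 2, scale=3, deadpool_timeout=1.0)
print(outcome)  # {'timeout': 1.0, 'priority': 0, 'result': 6}
```

Here ``scale`` reaches ``work``, while ``deadpool_timeout`` is absorbed by ``submit`` and never appears in the task's keyword arguments.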
423
+ Handling OOM killed situations
424
+ ------------------------------
425
+
426
+ .. code-block:: python
427
+
428
+     import time
+     import deadpool
+
+     def f():
+         # Stand-in for any allocation big enough to invoke the OOM killer
+         x = list(range(10**100))
+
+     with deadpool.Deadpool() as exe:
+         fut = exe.submit(f, deadpool_timeout=1.0)
+
+         try:
+             result = fut.result()
+         except deadpool.ProcessError:
+             print("Oh no someone killed my task!")
441
+
442
+
443
+ As long as the OOM killer terminates merely a subprocess (and not the main
444
+ process), which is likely because it'll be your subprocess that is using too
445
+ much memory, this will not hurt the pool, and it will be able to receive and
446
+ process more tasks. Note that this event will show up as a ``ProcessError``
447
+ exception when accessing the future, so you have a way of at least tracking
448
+ these events.
449
+
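One way to track such events is a done-callback that counts outcomes by exception type. The sketch below uses plain stdlib ``concurrent.futures.Future`` objects resolved by hand, plus a local ``ProcessError`` class standing in for ``deadpool.ProcessError``, so that it runs without a pool. It assumes that Deadpool futures support ``add_done_callback`` and ``exception()`` in the usual way.

```python
from collections import Counter
from concurrent.futures import Future


class ProcessError(Exception):
    """Stand-in for deadpool.ProcessError so the sketch runs without a pool."""


outcomes = Counter()


def track(fut):
    # Called once the future is done; exception() returns None on success.
    exc = fut.exception()
    outcomes["ok" if exc is None else type(exc).__name__] += 1


# Two hand-resolved futures play the role of pool results.
ok, killed = Future(), Future()
for fut in (ok, killed):
    fut.add_done_callback(track)
ok.set_result(123)
killed.set_exception(ProcessError("worker was killed"))

print(dict(outcomes))  # {'ok': 1, 'ProcessError': 1}
```

In a real service the counter would typically be replaced by a metrics or logging call.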
450
+ Design Details
451
+ ==============
452
+
453
+ Typical Example - with timeouts
454
+ -------------------------------
455
+
456
+ Here's a typical example of how code using Deadpool might look. The
457
+ output of the code further below should be similar to the following:
458
+
459
+ .. code-block:: bash
460
+
461
+ $ python examples/entrypoint.py
462
+ ...................xxxxxxxxxxx.xxxxxxx.x.xxxxxxx.x
463
+ $
464
+
465
+ Each ``.`` is a successfully completed task, and each ``x`` is a task
466
+ that timed out. Below is the code for this example.
467
+
468
+ .. code-block:: python
469
+
470
+     import random, time
+     import deadpool
+
+
+     def work():
+         time.sleep(random.random() * 4.0)
+         print(".", end="", flush=True)
+         return 1
+
+
+     def main():
+         with deadpool.Deadpool() as exe:
+             futs = (exe.submit(work, deadpool_timeout=2.0) for _ in range(50))
+             for fut in deadpool.as_completed(futs):
+                 try:
+                     assert fut.result() == 1
+                 except deadpool.TimeoutError:
+                     print("x", end="", flush=True)
+
+
+     if __name__ == "__main__":
+         main()
+         print()
493
+
494
+ - The work function will be busy for a random time period between 0 and
495
+ 4 seconds.
496
+ - There is a ``deadpool_timeout`` kwarg given to the ``submit`` method.
497
+ This kwarg is special and will be consumed by Deadpool. You cannot
498
+ use this kwarg name for your own task functions.
499
+ - When a task completes, it prints out ``.`` internally. But when a task
500
+ raises a ``deadpool.TimeoutError``, an ``x`` will be printed out instead.
501
+ - When a task times out, keep in mind that the underlying process that
502
+ is executing that task is killed, literally with the ``SIGKILL`` signal.
503
+
504
+ Deadpool tasks have priority
505
+ ----------------------------
506
+
507
+ The example below is similar to the previous one for timeouts. In fact
508
+ this example retains the timeouts to show how the different features
509
+ compose together. In this example we create tasks with different
510
+ priorities, and we change the printed character of each task to show
511
+ that higher priority items are executed first.
512
+
513
+ The code example will print something similar to the following:
514
+
515
+ .. code-block:: bash
516
+
517
+ $ python examples/priorities.py
518
+ !!!!!xxxxxxxxxxx!x..!...x.xxxxxxxx.xxxx.x...xxxxxx
519
+
520
+ You can see how the ``!`` characters, used for indicating higher priority
521
+ tasks, appear towards the front indicating that they were executed sooner.
522
+ Below is the code.
523
+
524
+ .. code-block:: python
525
+
526
+     import random, time
+     import deadpool
+
+
+     def work(symbol):
+         time.sleep(random.random() * 4.0)
+         print(symbol, end="", flush=True)
+         return 1
+
+
+     def main():
+         with deadpool.Deadpool(max_backlog=100) as exe:
+             futs = []
+             for _ in range(25):
+                 fut = exe.submit(work, ".", deadpool_timeout=2.0, deadpool_priority=10)
+                 futs.append(fut)
+                 fut = exe.submit(work, "!", deadpool_timeout=2.0, deadpool_priority=0)
+                 futs.append(fut)
+
+             for fut in deadpool.as_completed(futs):
+                 try:
+                     assert fut.result() == 1
+                 except deadpool.TimeoutError:
+                     print("x", end="", flush=True)
+
+
+     if __name__ == "__main__":
+         main()
+         print()
555
+
556
+ - When the tasks are submitted, they are given a priority. The default
557
+ value for the ``deadpool_priority`` parameter is 0, but here we'll
558
+ write them out explicitly. Half of the tasks will have priority 10 and
559
+ half will have priority 0.
560
+ - A lower value for the ``deadpool_priority`` parameter means a **higher**
561
+ priority. The highest priority allowed is indicated by 0. Negative
562
+ priority values are not allowed.
563
+ - I also specified the ``max_backlog`` parameter when creating the
564
+ Deadpool instance. This is discussed in more detail next, but quickly:
565
+ task priority can only be enforced on what is in the submitted backlog
566
+ of tasks, and the ``max_backlog`` parameter controls the depth of that
567
+ queue. If ``max_backlog`` is too low, then the window of prioritization
568
+ will not include tasks submitted later which might have higher priorities
569
+ than earlier-submitted tasks. The ``submit`` call will in fact block
570
+ once the ``max_backlog`` depth has been reached.
571
+
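The ordering rule described above (lower number first, within a bounded backlog) can be sketched with the stdlib ``queue.PriorityQueue``. This is an illustration only and says nothing about how Deadpool is implemented internally; ``backlog`` and ``enqueue`` are hypothetical names.

```python
import itertools
import queue

# Hypothetical stand-in for a bounded submit backlog (max_backlog=100)
backlog = queue.PriorityQueue(maxsize=100)
counter = itertools.count()  # tie-breaker keeps FIFO order within a priority


def enqueue(priority, name):
    if priority < 0:
        raise ValueError("negative priorities are not allowed")
    # put() would block if the queue already held maxsize items
    backlog.put((priority, next(counter), name))


for i in range(3):
    enqueue(10, f"low-{i}")
    enqueue(0, f"high-{i}")

# Items come out lowest priority value first
order = [backlog.get()[2] for _ in range(6)]
print(order)
# ['high-0', 'high-1', 'high-2', 'low-0', 'low-1', 'low-2']
```

Only items that are in the queue at the same time can be reordered against each other, which is why a small backlog narrows the window of prioritization.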
572
+ Controlling the backlog of submitted tasks
573
+ ------------------------------------------
574
+
575
+ By default, the ``max_backlog`` parameter is set to 1000. This parameter is
576
+ used to set the size of the "submit queue". The submit queue is the place
577
+ where submitted tasks are held before they are executed in background
578
+ processes.
579
+
580
+ If the submit queue is large (``max_backlog``), it will mean
581
+ that a large number of tasks can be added to the system with the
582
+ ``submit`` method, even before any tasks have finished executing. Conversely,
583
+ a low ``max_backlog`` parameter means that the submit queue will fill up
584
+ faster. If the submit queue is full, it means that the next call to
585
+ ``submit`` will block.
586
+
587
+ This kind of blocking is fine, and typically desired. It means that
588
+ backpressure from blocking is controlling the amount of work in flight.
589
+ By using a smaller ``max_backlog``, you'll also be
590
+ limiting the amount of memory in use during the execution of all the tasks.
591
+
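The blocking behaviour can be demonstrated with a plain bounded ``queue.Queue`` from the stdlib, which here plays the role of a submit queue with ``max_backlog=2``. This is a sketch of the general backpressure idea under that assumption, not a reproduction of Deadpool's internals.

```python
import queue

submit_queue = queue.Queue(maxsize=2)  # plays the role of max_backlog=2
submit_queue.put("task-1")
submit_queue.put("task-2")

try:
    # A plain put() would block here until a consumer frees a slot;
    # the timeout only exists so that this demo can finish.
    submit_queue.put("task-3", timeout=0.1)
    blocked = False
except queue.Full:
    blocked = True

submit_queue.get()           # a "worker" picks up one task
submit_queue.put("task-3")   # now there is room again

print(blocked, submit_queue.qsize())  # True 2
```

The producer can never get more than two items ahead of the consumer, which is what bounds memory use.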
592
+ .. warning::
593
+
594
+ A blocking ``submit`` is dangerous if you call it from an asyncio
595
+ event loop thread, for example via ``loop.run_in_executor(...)``. If
596
+ the submit queue is full, the ``submit`` call will block the event
597
+ loop, stalling every other coroutine and task on that loop. This is
598
+ why the default ``max_backlog`` is deliberately large: with a high
599
+ value, ``submit`` is very unlikely to block in practice. If you are
600
+ *not* submitting from an event loop and you want backpressure (and
601
+ lower memory use), set ``max_backlog`` to a small value to make
602
+ ``submit`` block once the backlog is full.
603
+
604
+ However, if you still accumulate received futures as my
605
+ example code above is doing, that accumulation, i.e., the list of futures,
606
+ will contribute to memory growth. If you have a large amount of work, it
607
+ will be better to set a *callback* function on each of the futures rather
608
+ than processing them by iterating over ``as_completed``.
609
+
610
+ The example below illustrates this technique for keeping memory
611
+ consumption down:
612
+
613
+ .. code-block:: python
614
+
615
+     import random, time
+     import deadpool
+
+
+     def work():
+         time.sleep(random.random() * 4.0)
+         print(".", end="", flush=True)
+         return 1
+
+
+     def cb(fut):
+         try:
+             assert fut.result() == 1
+         except deadpool.TimeoutError:
+             print("x", end="", flush=True)
+
+
+     def main():
+         with deadpool.Deadpool() as exe:
+             for _ in range(50):
+                 exe.submit(work, deadpool_timeout=2.0).add_done_callback(cb)
+
+
+     if __name__ == "__main__":
+         main()
+         print()
641
+
642
+
643
+ With this callback-based design, we no longer have an accumulation of futures
644
+ in a list. We get the same kind of output as in the "typical example" from
645
+ earlier:
646
+
647
+ .. code-block:: bash
648
+
649
+ $ python examples/callbacks.py
650
+ .....xxx.xxxxxxxxx.........x..xxxxx.x....x.xxxxxxx
651
+
652
+
653
+ Speaking of callbacks, the customized ``Future`` class used by Deadpool
654
+ lets you set a callback for when the task begins executing on a real
655
+ system process. That can be configured like so:
656
+
657
+ .. code-block:: python
658
+
659
+     with deadpool.Deadpool() as exe:
+         f = exe.submit(work)
+
+         def cb(fut: deadpool.Future):
+             print(f"My task is running on process {fut.pid}")
+
+         f.add_pid_callback(cb)
666
+
667
+ Obviously, both kinds of callbacks can be added:
668
+
669
+ .. code-block:: python
670
+
671
+     with deadpool.Deadpool() as exe:
+         f = exe.submit(work)
+         # Callback return values are discarded, so print the messages
+         f.add_pid_callback(lambda fut: print(f"Started on {fut.pid=}"))
+         f.add_done_callback(lambda fut: print(f"Completed {fut.pid=}"))
675
+
676
+ More about shutdown
677
+ -------------------
678
+
679
+ In the documentation for ProcessPoolExecutor_, the following function
680
+ signature is given for the shutdown_ method of the executor interface:
681
+
682
+ .. code-block:: python
683
+
684
+ shutdown(wait=True, *, cancel_futures=False)
685
+
686
+ I want to honor this, but it presents some difficulties because the
687
+ semantics of the ``wait`` and ``cancel_futures`` parameters need to be
688
+ somewhat different for Deadpool.
689
+
690
+ In Deadpool, this is what the combinations of those flags mean:
691
+
692
+ .. csv-table:: Shutdown flags
693
+ :header: ``wait``, ``cancel_futures``, ``effect``
694
+ :widths: 10, 10, 80
695
+ :align: left
696
+
697
+ ``True``, ``True``, "Wait for already-running tasks to complete; the
698
+ ``shutdown()`` call will unblock (return) when they're done. Cancel
699
+ all pending tasks that are in the submit queue, but have not yet started
700
+ running. The ``fut.cancelled()`` method will return ``True`` for such
701
+ cancelled tasks."
702
+ ``True``, ``False``, "Wait for already-running tasks to complete.
703
+ Pending tasks in the
704
+ submit queue that have not yet started running will *not* be cancelled, and
705
+ will all continue to execute. The ``shutdown()`` call will return only
706
+ after all submitted tasks have completed. "
707
+ ``False``, ``True``, "Already-running tasks **will be cancelled** and this
708
+ means the underlying subprocesses executing these tasks will receive
709
+ SIGKILL. Pending tasks on the submit queue that have not yet started
710
+ running will also be cancelled."
711
+ ``False``, ``False``, "This is a strange one. What to do if the caller
712
+ doesn't want to wait, but also doesn't want to cancel things? In this
713
+ case, already-running tasks will be allowed to complete, but pending
714
+ tasks on the submit queue will be cancelled. This is the same outcome as
715
+ ``wait==True`` and ``cancel_futures==True``. An alternative design
716
+ might have been to allow all tasks, both running and pending, to just
717
+ keep going in the background even after the ``shutdown()`` call
718
+ returns. Does anyone have a use-case for this?"
719
+
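As a compact restatement of the table above, the following sketch maps
each flag combination to what happens to running and pending tasks. The
function name ``describe_shutdown`` and its return values are hypothetical
and exist only for illustration; this is not Deadpool's implementation.

.. code-block:: python

    def describe_shutdown(wait: bool, cancel_futures: bool) -> dict[str, str]:
        """Summarise the table: the fate of running and pending tasks."""
        # Running tasks are killed only for wait=False, cancel_futures=True.
        running = "killed" if (not wait and cancel_futures) else "completes"
        # Pending tasks keep going only for wait=True, cancel_futures=False.
        pending = "runs" if (wait and not cancel_futures) else "cancelled"
        return {"running": running, "pending": pending}


    # Print one line per row of the table.
    for wait in (True, False):
        for cancel_futures in (True, False):
            print(wait, cancel_futures, describe_shutdown(wait, cancel_futures))

Note that only one combination kills running work, and only one lets
pending work continue.
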
720
+ If you're using ``Deadpool`` as a context manager, you might be wondering
721
+ how exactly to set these parameters in the ``shutdown`` call, since that
722
+ call is made for you automatically when the context manager exits.
723
+
724
+ For this, Deadpool accepts additional parameters that can be passed
725
+ when creating the instance:
726
+
727
+ .. code-block:: python
728
+
729
+ # This is pseudocode
730
+ import deadpool
731
+
732
+ with deadpool.Deadpool(
733
+     shutdown_wait=True,
734
+     shutdown_cancel_futures=True,
735
+ ) as exe:
736
+     fut = exe.submit(...)
737
+
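To picture how constructor parameters like these can reach ``shutdown()``
when the ``with`` block exits, here is a minimal sketch using a toy
stand-in class. ``ToyPool`` and its ``shutdown_calls`` attribute are
hypothetical; the sketch only mirrors the parameter names described above
and is not the actual Deadpool code.

.. code-block:: python

    class ToyPool:
        """Toy executor that remembers how shutdown() should be called."""

        def __init__(self, shutdown_wait=True, shutdown_cancel_futures=False):
            self.shutdown_wait = shutdown_wait
            self.shutdown_cancel_futures = shutdown_cancel_futures
            self.shutdown_calls = []  # records (wait, cancel_futures) pairs

        def shutdown(self, wait=True, *, cancel_futures=False):
            # A real executor would stop workers here; the toy only records.
            self.shutdown_calls.append((wait, cancel_futures))

        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc, tb):
            # Forward the flags stored at construction time.
            self.shutdown(
                self.shutdown_wait,
                cancel_futures=self.shutdown_cancel_futures,
            )
            return False  # do not swallow exceptions from the with-block


    with ToyPool(shutdown_wait=False, shutdown_cancel_futures=True) as pool:
        pass

    print(pool.shutdown_calls)

The context manager never asks the caller for the flags at exit time, which
is why they have to be supplied up front.
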
738
+ Developer Workflow
739
+ ==================
740
+
741
+ nox
742
+ ---
743
+
744
+ This project uses ``nox``. Follow the instructions for installing
745
+ nox at their page, and then come back here.
746
+
747
+ While nox can be configured so that all the tools for each of
748
+ the tasks can be installed automatically when run, this takes
749
+ too much time and so I've decided that you should just have
750
+ the following tools in your environment, ready to go. They
751
+ do not need to be installed in the same venv or anything like
752
+ that. I've found a convenient way to do this is with ``pipx``.
753
+ For example, to install ``black`` using ``pipx`` you can do
754
+ the following:
755
+
756
+ .. code-block:: shell
757
+
758
+ $ pipx install black
759
+
760
+ You must do the same for ``isort`` and ``ruff``. See the following
761
+ sections for actually using ``nox`` to perform dev actions.
762
+
763
+ tests
764
+ -----
765
+
766
+ To run the tests:
767
+
768
+ .. code-block:: shell
769
+
770
+ $ nox -s test
771
+
772
+ To run the tests for a particular Python version, with coverage:
773
+
774
+ .. code-block:: shell
775
+
776
+ $ nox -s testcov-3.11
777
+
778
+ To pass additional arguments to pytest, use the ``--`` separator:
779
+
780
+ .. code-block:: shell
781
+
782
+ $ nox -s testcov-3.11 -- -k test_deadpool -s <etc>
783
+
784
+ The usage above is nonstandard, but I customized the ``noxfile.py`` to
785
+ allow this.
786
+
787
+ style
788
+ -----
789
+
790
+ To apply style fixes and check for any remaining lints:
791
+
792
+ .. code-block:: shell
793
+
794
+ $ nox -t style
795
+
796
+ docs
797
+ ----
798
+
799
+ The only docs currently are this README, which uses RST. GitHub
800
+ uses `docutils <https://docutils.sourceforge.io/docs/ref/rst/directives.html>`_
801
+ to render RST.
802
+
803
+ release
804
+ -------
805
+
806
+ This project uses flit to release the package to PyPI. The whole
807
+ process isn't as automated as I would like, but this is what
808
+ I currently do:
809
+
810
+ 1. Ensure that the ``main`` branch is fully up to date with everything
811
+ to be released, and that all the tests succeed.
812
+ 2. Change the ``__version__`` field in ``deadpool.py``. Flit
813
+ uses this to stamp the version.
814
+ 3. Verify that ``flit build`` succeeds. This will produce a
815
+ wheel in the ``dist/`` directory. You can inspect this
816
+ wheel to ensure it contains only what is necessary. This
817
+ wheel will be what is uploaded to PyPI.
818
+ 4. **Commit the changed** ``__version__``. It is easy to forget this
819
+ step, resulting in multiple awkward releases to try to
820
+ get the state all correct again.
821
+ 5. Now create the git tag and push to GitHub:
822
+
823
+ .. code-block:: shell
824
+
825
+ $ git tag YYYY.MM.patch
826
+ $ git push --tags origin main
827
+
828
+ 6. Now deploy to PyPI:
829
+
830
+ .. code-block:: shell
831
+
832
+ $ flit publish
833
+
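Because the version bump and the tag are easy to get out of sync, a small
check before tagging may help. The sketch below assumes the version string
follows the ``YYYY.MM.patch`` pattern used for the tag in step 5; the helper
names ``is_calver`` and ``tag_matches_version`` are hypothetical and not
part of this project.

.. code-block:: python

    import re

    # Assumed pattern: four-digit year, month 1-12 (optionally zero-padded),
    # then a numeric patch component.
    CALVER = re.compile(r"^(\d{4})\.(0?[1-9]|1[0-2])\.(\d+)$")


    def is_calver(version: str) -> bool:
        """Return True if the string looks like YYYY.MM.patch."""
        return CALVER.match(version) is not None


    def tag_matches_version(tag: str, version: str) -> bool:
        """Return True if the git tag equals a well-formed version string."""
        return is_calver(version) and tag == version


    print(tag_matches_version("2026.6.1", "2026.6.1"))

Something like this could be run by hand before ``git tag``, passing in the
intended tag and the ``__version__`` value.
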
834
+
835
+ .. _shutdown: https://docs.python.org/3/library/concurrent.futures.html?highlight=brokenprocesspool#concurrent.futures.Executor.shutdown
836
+ .. _ProcessPoolExecutor: https://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#processpoolexecutor
837
+ .. _RuntimeError: https://github.com/noxdafox/pebble/issues/42#issuecomment-551245730
838
+ .. _OOM killer: https://en.wikipedia.org/wiki/Out_of_memory#Out_of_memory_management
839
+ .. _multiprocessing.Pool: https://docs.python.org/3.11/library/multiprocessing.html#multiprocessing.pool.Pool
840
+ .. _Apache 2.0: https://www.apache.org/licenses/LICENSE-2.0
841
+ .. _Affero GPL 3.0: https://www.gnu.org/licenses/agpl-3.0.html
842
+