uringmachine 0.22.0 → 0.22.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +4 -0
- data/benchmark/README.md +68 -102
- data/benchmark/chart.png +0 -0
- data/benchmark/common.rb +4 -0
- data/ext/um/um_const.c +1 -1
- data/grant-2025/tasks.md +1 -1
- data/lib/uringmachine/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: fbf0486feb49686a85ad162dd569ce125b4f168904e6f364e7c4358be234ca78
+  data.tar.gz: 27d0e3533c7d41f87b95376258992e37ebfa0d42c32ac0b46e7595939b3f1afa
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c4bf87080fe32e145e334944cc1871d109f5b11c5c3cb413c9de3f44cf09b9554ac3fabeb54650c18f67ed96d6381a0290ae8e461ff37b9fef3d27351675ff68
+  data.tar.gz: a53addafe3007de3ad77f3e42e4ca3b3f90ae372af24b00beff1b5b486fa4c1f5d6c84de22788192b05328c14c7eeec4e602a77623a00c5c2f8de5bdbd8dca7d
data/CHANGELOG.md
CHANGED
data/benchmark/README.md
CHANGED
@@ -4,25 +4,26 @@ The following benchmarks measure the performance of UringMachine against stock
 Ruby in a variety of scenarios. For each scenario, we compare three different
 implementations:

-
+- `Threads`: thread-based concurrency using the stock Ruby I/O and
   synchronization classes.

-
-
-  Ruby I/O and synchronization classes.
+- `ThreadPool`: thread pool consisting of 10 worker threads, receiving jobs
+  through a common queue.

-
-
+- `Async epoll`: fiber-based concurrency with
+  [Async](https://github.com/socketry/async) fiber scheduler, using an epoll
+  selector.

-
-
+- `Async uring`: fiber-based concurrency with Async fiber scheduler, using a
+  uring selector.

-
-
+- `UM FS`: fiber-based concurrency with UringMachine fiber scheduler.
+
+- `UM`: fiber-based concurrency using the UringMachine low-level API.

 <img src="./chart.png">

-## Observations
+## Observations

 - We see the stark difference between thread-based and fiber-based concurrency.
   For I/O-bound workloads, there's really no contest - and that's exactly why
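The bullet list added above names the variants being compared. For orientation, here is a minimal, self-contained sketch of what the `Threads` variant of the pipe scenario (benchmark 1 below) could look like with stock Ruby. This is not the gem's actual benchmark code (that lives under `data/benchmark/` in the package), and the per-group chunk count is an assumption.

```ruby
# Hypothetical sketch of the "Threads" pipe scenario: 50 groups, each with a
# writer and a reader thread moving 1KB chunks through an IO.pipe.
CHUNK  = 'x' * 1024
ROUNDS = 1000 # assumed; the README does not state the per-group chunk count

threads = 50.times.flat_map do
  rd, wr = IO.pipe
  [
    Thread.new { ROUNDS.times { wr.write(CHUNK) }; wr.close },
    Thread.new { nil while rd.read(1024); rd.close } # read returns nil at EOF
  ]
end
threads.each(&:join)
```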
@@ -34,28 +35,37 @@ implementations:
 C-extension.

 - The UringMachine low-level API is faster to use in most cases, and its
-  performance advantage grows with the level of concurrency.
-
-
-  depending on the context. It remains to be seen how it affects performance in
-  real-world situations.
+  performance advantage grows with the level of concurrency. Interestingly, when
+  performing CPU-bound work, it seems slightly slower. This should be
+  investigated.

 - The [pg](https://github.com/ged/ruby-pg) gem supports the use of fiber
   schedulers, and there too we see a marked performance advantage to using
   fibers instead of threads.

+According to these benchmarks, for I/O-bound scenarios the different fiber-based
+implementations present an average speedup as follows:
+
+|implementation|average factor|
+|--------------|--------------|
+|Async epoll   |x2.36         |
+|Async uring   |x2.42         |
+|UM FS         |x2.85         |
+|UM            |x6.20         |
+
 ## 1. I/O - Pipe

 50 groups, where in each group we create a pipe with a pair of threads/fibers
 writing/reading 1KB of data to the pipe.

 ```
-C=50x2
-Threads
-
-
-
-UM
+C=50x2          user     system      total        real
+Threads     2.105002   2.671980   4.776982 (  4.272842)
+ThreadPool  4.818014  10.740555  15.558569 (  7.070236)
+Async epoll 1.118937   0.254803   1.373740 (  1.374298)
+Async uring 1.363248   0.270063   1.633311 (  1.633696)
+UM FS       0.746332   0.183006   0.929338 (  0.929619)
+UM          0.237816   0.328352   0.566168 (  0.566265)
 ```

 ## 2. I/O - Socketpair
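The speedup table added above is an average across the I/O-bound scenarios; the README does not spell out the formula. Assuming each scenario's factor is the `Threads` real time divided by the implementation's real time, the pipe results (C=50x2) added in this hunk work out roughly as follows; these per-scenario factors naturally differ from the cross-scenario averages in the table.

```ruby
# Per-scenario speedup factors for the pipe benchmark (C=50x2), assuming
# factor = Threads real time / implementation real time.
threads_real = 4.272842
{
  'Async epoll' => 1.374298,
  'Async uring' => 1.633696,
  'UM FS'       => 0.929619,
  'UM'          => 0.566265,
}.each do |name, real|
  printf("%-12s x%.2f\n", name, threads_real / real)
end
# => Async epoll x3.11, Async uring x2.62, UM FS x4.60, UM x7.55
```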
@@ -64,12 +74,13 @@ UM sqpoll 0.217577 0.634414 0.851991 ( 0.593531)
 pair of threads/fibers writing/reading 1KB of data to the sockets.

 ```
-
-Threads
-
-
-
-UM
+C=50x2          user     system      total        real
+Threads     2.068122   3.247781   5.315903 (  4.295488)
+ThreadPool  2.283882   3.461607   5.745489 (  4.650422)
+Async epoll 0.381400   0.846445   1.227845 (  1.227983)
+Async uring 0.472526   0.821467   1.293993 (  1.294166)
+UM FS       0.443023   0.734334   1.177357 (  1.177576)
+UM          0.116995   0.675997   0.792992 (  0.793183)
 ```

 ## 3. Mutex - CPU-bound
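The socketpair results above follow the same 50x2 pattern as the pipe scenario. As a rough illustration of the fiber-based variants, here is a hedged sketch of how such a scenario can be written against the Async scheduler; the `IO_EVENT_SELECTOR` environment variable mentioned in the comment is an assumption about how one switches between the epoll and io_uring selectors of the underlying io-event gem, not something this diff confirms, and the chunk count is likewise assumed.

```ruby
# Hypothetical Async version of the socketpair scenario (C=50x2).
# Running with e.g. IO_EVENT_SELECTOR=URing (or EPoll) is assumed to pick the
# io-event selector; the gem's actual benchmark harness may differ.
require 'async'
require 'socket'

CHUNK  = 'x' * 1024
ROUNDS = 1000 # assumed chunk count

Async do |task|
  50.times do
    a, b = UNIXSocket.pair
    task.async { ROUNDS.times { a.write(CHUNK) }; a.close }
    task.async { nil while b.read(1024); b.close }
  end
end
```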
@@ -78,12 +89,12 @@ UM sqpoll 0.220933 1.021997 1.242930 ( 0.976198)
 threads/fibers locking the mutex and performing a Regexp match.

 ```
-
-Threads
-Async
-
-UM
-UM
+C=20x10         user     system      total        real
+Threads     5.174998   0.024885   5.199883 (  5.193211)
+Async epoll 5.309793   0.000949   5.310742 (  5.311217)
+Async uring 5.341404   0.004860   5.346264 (  5.346963)
+UM FS       5.363719   0.001976   5.365695 (  5.366254)
+UM          5.351073   0.005986   5.357059 (  5.357602)
 ```

 ## 4. Mutex - I/O-bound
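The near-identical timings added above for scenario 3 are consistent with the work being CPU-bound: the Regexp match inside the lock dominates, so the concurrency model matters little. A minimal thread-based sketch of that scenario, with the text, pattern and iteration count as assumptions:

```ruby
# Hypothetical thread-based version of the CPU-bound mutex scenario (C=20x10):
# each worker locks a shared mutex and performs a Regexp match inside it.
TEXT   = 'the quick brown fox jumps over the lazy dog ' * 64
RE     = /lazy (\w+)/
ROUNDS = 1000 # assumed iteration count

workers = 20.times.flat_map do
  mutex = Mutex.new
  10.times.map do
    Thread.new { ROUNDS.times { mutex.synchronize { RE.match(TEXT) } } }
  end
end
workers.each(&:join)
```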
@@ -93,81 +104,36 @@ start 10 worker threads/fibers locking the mutex and writing 1KB chunks to the
 file.

 ```
-
-Threads
-Async
-
-UM
-UM
-
-N=5             user     system      total        real
-Threads     0.214296   0.384078   0.598374 (  0.467425)
-Async FS    0.085820   0.158782   0.244602 (  0.139766)
-UM FS       0.064279   0.147278   0.211557 (  0.117488)
-UM pure     0.036478   0.182950   0.219428 (  0.119745)
-UM sqpoll   0.036929   0.347573   0.384502 (  0.160814)
-
-N=10            user     system      total        real
-Threads     0.435688   0.752219   1.187907 (  0.924561)
-Async FS    0.126573   0.303704   0.430277 (  0.234900)
-UM FS       0.128427   0.215204   0.343631 (  0.184074)
-UM pure     0.065522   0.359659   0.425181 (  0.192385)
-UM sqpoll   0.076810   0.477429   0.554239 (  0.210087)
-
-N=20            user     system      total        real
-Threads     0.830763   1.585299   2.416062 (  1.868194)
-Async FS    0.291823   0.644043   0.935866 (  0.507887)
-UM FS       0.226202   0.460401   0.686603 (  0.362879)
-UM pure     0.120524   0.616274   0.736798 (  0.332182)
-UM sqpoll   0.177150   0.849890   1.027040 (  0.284069)
-
-N=50            user     system      total        real
-Threads     2.124048   4.182537   6.306585 (  4.878387)
-Async FS    0.897134   1.268629   2.165763 (  1.254624)
-UM FS       0.733193   0.971821   1.705014 (  0.933749)
-UM pure     0.226431   1.504441   1.730872 (  0.760731)
-UM sqpoll   0.557310   2.107389   2.664699 (  0.783992)
-
-N=100           user     system      total        real
-Threads     4.420832   8.628756  13.049588 ( 10.264590)
-Async FS    2.557661   2.532998   5.090659 (  3.179336)
-UM FS       2.262136   1.912055   4.174191 (  2.523789)
-UM pure     0.633897   2.793998   3.427895 (  1.612989)
-UM sqpoll   1.119460   4.193703   5.313163 (  1.525968)
+C=50x10         user     system      total        real
+Threads     2.042649   3.441547   5.484196 (  4.328783)
+Async epoll 0.810375   0.744084   1.554459 (  1.554726)
+Async uring 0.854985   1.129260   1.984245 (  1.140749)
+UM FS       0.686329   0.872376   1.558705 (  0.845214)
+UM          0.250370   1.323227   1.573597 (  0.720928)
 ```

-## 5.
+## 5. Postgres client

-
-threads/fibers that push items to the queue, and 10 consumer threads/fibers that
-pull items from the queue.
+C concurrent threads/fibers, each thread issuing a SELECT query to a PG database.

 ```
-
-Threads
-Async
-
-UM
-UM sqpoll   2.044662   2.460344   4.505006 (  2.261502)
+C=50            user     system      total        real
+Threads     4.304292   1.358116   5.662408 (  4.795725)
+Async epoll 2.890160   0.432836   3.322996 (  3.334350)
+Async uring 2.818439   0.433896   3.252335 (  3.252799)
+UM FS       2.819371   0.443182   3.262553 (  3.264606)
 ```
+## 6. Queue

-
-
-
+20 concurrent groups, where in each group we create a queue, start 5 producer
+threads/fibers that push items to the queue, and 10 consumer threads/fibers that
+pull items from the queue.

 ```
-C=10
-Threads
-Async
-
-
-
-Threads     1.652901   0.714299   2.367200 (  2.014781)
-Async FS    1.136826   0.212991   1.349817 (  1.350544)
-UM FS       1.084873   0.205865   1.290738 (  1.291865)
-
-C=50            user     system      total        real
-Threads     4.410604   1.804900   6.215504 (  5.253016)
-Async FS    2.918522   0.507981   3.426503 (  3.427966)
-UM FS       2.789549   0.537269   3.326818 (  3.329802)
+C=20x(5+10)     user     system      total        real
+Threads     4.880983   0.207451   5.088434 (  5.071019)
+Async epoll 4.107208   0.006519   4.113727 (  4.114227)
+Async uring 4.206283   0.028974   4.235257 (  4.235705)
+UM FS       4.082394   0.001719   4.084113 (  4.084522)
+UM          4.099893   0.323569   4.423462 (  4.424089)
 ```
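Of the three scenarios rewritten in this hunk, the queue one (scenario 6) is the easiest to picture with stock Ruby alone. A hedged thread-based sketch, with the per-producer item count as an assumption:

```ruby
# Hypothetical thread-based version of the queue scenario (C=20x(5+10)):
# 20 groups, each with 5 producers pushing to a Queue and 10 consumers popping.
ITEMS = 1000 # assumed per-producer item count

coordinators = 20.times.map do
  queue = Queue.new
  producers = 5.times.map  { Thread.new { ITEMS.times { |i| queue << i } } }
  consumers = 10.times.map { Thread.new { nil while queue.pop } }
  Thread.new do
    producers.each(&:join)
    queue.close # once closed and drained, Queue#pop returns nil, ending consumers
    consumers.each(&:join)
  end
end
coordinators.each(&:join)
```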
data/benchmark/chart.png
CHANGED
Binary file
data/benchmark/common.rb
CHANGED
@@ -118,6 +118,8 @@ class UMBenchmark
     fds = []
     do_um(machine, fibers, fds)
     machine.await_fibers(fibers)
+    puts "UM:"
+    p machine.metrics
     fds.each { machine.close(it) }
   end

@@ -128,6 +130,8 @@ class UMBenchmark
     do_um(machine, fibers, fds)
     machine.await_fibers(fibers)
     fds.each { machine.close_async(it) }
+    puts "UM sqpoll:"
+    p machine.metrics
     machine.snooze
   end
 end
data/ext/um/um_const.c
CHANGED
@@ -425,7 +425,7 @@ void um_define_net_constants(VALUE mod) {
   DEF_CONST_INT(mod, SIGTSTP);
   DEF_CONST_INT(mod, SIGCONT);
   DEF_CONST_INT(mod, SIGCHLD);
-  DEF_CONST_INT(mod, SIGCLD);
+  // DEF_CONST_INT(mod, SIGCLD);
   DEF_CONST_INT(mod, SIGTTIN);
   DEF_CONST_INT(mod, SIGTTOU);
   DEF_CONST_INT(mod, SIGIO);
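The change above comments out the `SIGCLD` constant; `SIGCLD` is the legacy System V alias for `SIGCHLD` and is not defined by every libc, which is presumably why it was tripping builds on some platforms. A hedged alternative sketch (not what this release does) would be to guard the definition instead of dropping it:

```c
/* Hypothetical alternative: only expose UM::SIGCLD where the legacy
   System V alias for SIGCHLD actually exists. */
#ifdef SIGCLD
  DEF_CONST_INT(mod, SIGCLD);
#endif
```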
data/grant-2025/tasks.md
CHANGED
@@ -152,7 +152,7 @@
   - When finally idle, calls `Process.warmup`
   - Starts replacing sibling workers with forked workers
   see also: https://www.youtube.com/watch?v=kAW5O2dkSU8
-- [ ] Each worker is single-threaded (except for
+- [ ] Each worker is single-threaded (except for worker threads)
 - [ ] Rack 3.0-compatible
   see: https://github.com/socketry/protocol-rack
 - [ ] Rails integration (Railtie)
data/lib/uringmachine/version.rb
CHANGED