uringmachine 0.19.1 → 0.21.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.github/workflows/test.yml +3 -4
- data/CHANGELOG.md +32 -1
- data/TODO.md +0 -39
- data/examples/bm_fileno.rb +33 -0
- data/examples/bm_mutex.rb +85 -0
- data/examples/bm_mutex_single.rb +33 -0
- data/examples/bm_queue.rb +29 -29
- data/examples/bm_send.rb +2 -5
- data/examples/bm_snooze.rb +20 -42
- data/examples/bm_write.rb +4 -1
- data/examples/fiber_scheduler_demo.rb +15 -51
- data/examples/fiber_scheduler_fork.rb +24 -0
- data/examples/nc_ssl.rb +71 -0
- data/ext/um/extconf.rb +5 -15
- data/ext/um/um.c +310 -74
- data/ext/um/um.h +66 -29
- data/ext/um/um_async_op.c +1 -1
- data/ext/um/um_async_op_class.c +2 -2
- data/ext/um/um_buffer.c +1 -1
- data/ext/um/um_class.c +178 -31
- data/ext/um/um_const.c +51 -3
- data/ext/um/um_mutex_class.c +1 -1
- data/ext/um/um_op.c +37 -0
- data/ext/um/um_queue_class.c +1 -1
- data/ext/um/um_stream.c +5 -5
- data/ext/um/um_stream_class.c +3 -0
- data/ext/um/um_sync.c +28 -39
- data/ext/um/um_utils.c +59 -19
- data/grant-2025/journal.md +353 -0
- data/grant-2025/tasks.md +135 -0
- data/lib/uringmachine/fiber_scheduler.rb +316 -57
- data/lib/uringmachine/version.rb +1 -1
- data/lib/uringmachine.rb +6 -0
- data/test/test_fiber_scheduler.rb +640 -0
- data/test/test_stream.rb +2 -2
- data/test/test_um.rb +722 -54
- data/uringmachine.gemspec +5 -5
- data/vendor/liburing/.github/workflows/ci.yml +94 -1
- data/vendor/liburing/.github/workflows/test_build.c +9 -0
- data/vendor/liburing/configure +27 -0
- data/vendor/liburing/examples/Makefile +6 -0
- data/vendor/liburing/examples/helpers.c +8 -0
- data/vendor/liburing/examples/helpers.h +5 -0
- data/vendor/liburing/liburing.spec +1 -1
- data/vendor/liburing/src/Makefile +9 -3
- data/vendor/liburing/src/include/liburing/barrier.h +11 -5
- data/vendor/liburing/src/include/liburing/io_uring/query.h +41 -0
- data/vendor/liburing/src/include/liburing/io_uring.h +51 -0
- data/vendor/liburing/src/include/liburing/sanitize.h +16 -4
- data/vendor/liburing/src/include/liburing.h +458 -121
- data/vendor/liburing/src/liburing-ffi.map +16 -0
- data/vendor/liburing/src/liburing.map +8 -0
- data/vendor/liburing/src/sanitize.c +4 -1
- data/vendor/liburing/src/setup.c +7 -4
- data/vendor/liburing/test/232c93d07b74.c +4 -16
- data/vendor/liburing/test/Makefile +15 -1
- data/vendor/liburing/test/accept.c +2 -13
- data/vendor/liburing/test/bind-listen.c +175 -13
- data/vendor/liburing/test/conn-unreach.c +132 -0
- data/vendor/liburing/test/fd-pass.c +32 -7
- data/vendor/liburing/test/fdinfo.c +39 -12
- data/vendor/liburing/test/fifo-futex-poll.c +114 -0
- data/vendor/liburing/test/fifo-nonblock-read.c +1 -12
- data/vendor/liburing/test/futex.c +1 -1
- data/vendor/liburing/test/helpers.c +99 -2
- data/vendor/liburing/test/helpers.h +9 -0
- data/vendor/liburing/test/io_uring_passthrough.c +6 -12
- data/vendor/liburing/test/mock_file.c +379 -0
- data/vendor/liburing/test/mock_file.h +47 -0
- data/vendor/liburing/test/nop.c +2 -2
- data/vendor/liburing/test/nop32-overflow.c +150 -0
- data/vendor/liburing/test/nop32.c +126 -0
- data/vendor/liburing/test/pipe.c +166 -0
- data/vendor/liburing/test/poll-race-mshot.c +13 -1
- data/vendor/liburing/test/read-write.c +4 -4
- data/vendor/liburing/test/recv-mshot-fair.c +81 -34
- data/vendor/liburing/test/recvsend_bundle.c +1 -1
- data/vendor/liburing/test/resize-rings.c +2 -0
- data/vendor/liburing/test/ring-query.c +322 -0
- data/vendor/liburing/test/ringbuf-loop.c +87 -0
- data/vendor/liburing/test/ringbuf-read.c +4 -4
- data/vendor/liburing/test/runtests.sh +2 -2
- data/vendor/liburing/test/send-zerocopy.c +43 -5
- data/vendor/liburing/test/send_recv.c +103 -32
- data/vendor/liburing/test/shutdown.c +2 -12
- data/vendor/liburing/test/socket-nb.c +3 -14
- data/vendor/liburing/test/socket-rw-eagain.c +2 -12
- data/vendor/liburing/test/socket-rw-offset.c +2 -12
- data/vendor/liburing/test/socket-rw.c +2 -12
- data/vendor/liburing/test/sqe-mixed-bad-wrap.c +87 -0
- data/vendor/liburing/test/sqe-mixed-nop.c +82 -0
- data/vendor/liburing/test/sqe-mixed-uring_cmd.c +153 -0
- data/vendor/liburing/test/timestamp.c +56 -19
- data/vendor/liburing/test/vec-regbuf.c +2 -4
- data/vendor/liburing/test/wq-aff.c +7 -0
- metadata +37 -15
data/grant-2025/journal.md
ADDED

@@ -0,0 +1,353 @@

# 2025-11-14

## Call with Samuel

- I explained the tasks that I want to do:

  1. FiberScheduler implementation for UringMachine
  2. Async SSL I/O
  3. Extend UringMachine & FiberScheduler with new functionality

- Samuel talked about two aspects:

  - Experimentation.
  - Integrating and improving on the existing ecosystem: publicly visible changes to interfaces.

So: improve on the FiberScheduler interface, and show UringMachine as an implementation.

Suggestions for tasks around FiberScheduler from Samuel:

1. Add Fiber::Scheduler#io_splice + IO-uring backing for IO.copy_stream

Summary:

Build an async-aware, zero-copy data-transfer path in Ruby by exposing Linux's splice(2) through the Fiber Scheduler, and wiring it up so IO.copy_stream can take advantage of io_uring when available.

Why it matters:

Large file copies and proxying workloads become dramatically faster and cheaper because the data never touches user space. This gives Ruby a modern, high-performance primitive for bulk I/O.
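As a thought experiment, a UringMachine-backed implementation of the proposed hook could be just a few lines. The hook name comes from the proposal above; the signature and the `@machine.splice` call are assumptions for illustration only:

```ruby
# Hypothetical sketch: #io_splice is a proposed hook, not part of Ruby's
# Fiber::Scheduler interface today.
def io_splice(src, dst, length)
  # Assumed low-level operation mirroring splice(2): submit a splice SQE,
  # then yield the calling fiber until the CQE arrives.
  @machine.splice(src.fileno, dst.fileno, length)
end
```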
2. Add support for registered IO-uring buffers via IO::Buffer

Summary:

Integrate io_uring's "registered buffers" feature with Ruby's IO::Buffer, allowing pre-allocated, pinned buffers to be reused across operations.

Why it matters:

Drastically reduces syscalls and buffer management overhead. Enables fully zero-copy, high-throughput network servers and a more direct path to competitive I/O performance.
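Ruby's stock IO::Buffer API already supports copy-free reads into a pre-allocated buffer; the proposal would additionally let such a buffer be registered with io_uring once and reused across operations. A minimal illustration of today's API (the file path is hypothetical):

```ruby
buffer = IO::Buffer.new(16_384)

File.open('/tmp/data.bin', 'rb') do |f|
  # Read up to 4096 bytes from f directly into the buffer, no String copy.
  bytes_read = buffer.read(f, 4096)
  p buffer.get_string(0, bytes_read)
end
```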
3. Richer process APIs using pidfds (Fiber::Scheduler#process_open)

Summary:

Introduce pidfd-backed process primitives in Ruby so processes can be opened, monitored, and waited on safely through the scheduler.

Why it matters:

Pidfds eliminate race conditions, improve cross-thread safety, and make process management reliably asynchronous. This enables safer job-runners, supervisors, and async orchestration patterns in Ruby.

4. Proper fork support for Fiber Scheduler (Fiber::Scheduler#process_fork)

Summary:

Define how fiber schedulers behave across fork: the child should start in a clean state, with hooks to reinitialize or discard scheduler data safely.

Why it matters:

fork + async currently work inconsistently. This project makes forking predictable, allowing libraries and apps to do post-fork setup (e.g., reconnect I/O, restart loops) correctly and safely.

5. Async-aware IO#close via io_uring prep_close + scheduler hook

Summary:

Introduce a formal closing state in Ruby's IO internals, add io_uring's prep_close support, and provide Fiber::Scheduler#io_close as an official hook.

Why it matters:

Today, IO#close can be slow or unsafe to call in async contexts because it must run synchronously. This project allows deferred/batched closing, avoids races, and modernizes Ruby's internal I/O lifecycle.

GDB/LLDB extensions: https://github.com/socketry/toolbox
# 2025-11-17

## Work on io-event Uring selector

I added an implementation of `process_wait` using `io_uring_prep_waitid`. This necessitates being able to create instances of `Process::Status`. For this, I've submitted a PR for exposing `rb_process_status_new`: https://github.com/ruby/ruby/pull/15213. Hopefully, this PR will be merged before the release of Ruby 4.0.

# 2025-11-21

## Work on UringMachine Fiber Scheduler

I've finally made some progress on the UringMachine fiber scheduler. This was a process of learning the mechanics of how the scheduler is integrated with the Ruby I/O layer. Some interesting warts in the Ruby `IO` implementation:

- When you call `Kernel.puts`, the trailing newline character is actually written separately, which can lead to unexpected output if, for example, you have multiple fibers writing to STDOUT at the same time. To prevent this, Ruby uses a mutex (per IO instance) to synchronize writes to the same IO.
- There are inconsistencies in how different kinds of IO objects are handled with regard to blocking/non-blocking operation ([O_NONBLOCK](https://linux.die.net/man/2/fcntl)):

  - Files and standard I/O are blocking.
  - Pipes are non-blocking.
  - Sockets are non-blocking.
  - OpenSSL sockets are non-blocking.

  The problem is that for io_uring to function properly, the fds passed to it should always be in blocking mode. To rectify this, I've added code to the fiber scheduler implementation that makes sure the IO instance is blocking:

  ```ruby
  def io_write(io, buffer, length, offset)
    reset_nonblock(io)
    @machine.write(io.fileno, buffer.get_string)
  rescue Errno::EINTR
    retry
  end

  # Puts the given IO in blocking mode, once per IO instance.
  def reset_nonblock(io)
    return if @ios.key?(io)

    @ios[io] = true
    UM.io_set_nonblock(io, false)
  end
  ```
- A phenomenon I've observed is that in some situations where multiple fibers are doing I/O, some of those I/O operations would raise an `EINTR`, which should mean the I/O operation was interrupted because of a signal sent to the process. Weird!

- There's some interesting stuff going on when calling `IO#close`. Apparently there's a mutex involved, and I noticed two scheduler hooks are being called: `#blocking_operation_wait`, which means a blocking operation that should be run on a separate thread, and `#block`, which means a mutex is being locked. I still need to figure out what is going on there and why it is so complex. FWIW, UringMachine has a `#close_async` method which, as its name suggests, submits a close operation but does not wait for it to complete.

- I've added some basic documentation to the `FiberScheduler` class, and started writing some tests. Now that I have a working fiber scheduler implementation and I'm beginning to understand the mechanics of it, I can start TDD'ing...

## Work on io-event Uring selector

- I've submitted a [PR](https://github.com/socketry/io-event/pull/154) for using `io_uring_prep_waitid` in the `process_wait` implementation. This relies on having a recent Linux kernel (>= 6.7) and on the aforementioned Ruby [PR](https://github.com/ruby/ruby/pull/15213) for exposing `rb_process_status_new` being merged. Hopefully this will happen in time for the Ruby 4.0 release.

# 2025-11-26

- Added some benchmarks measuring mutex performance vs the stock Ruby Mutex class. It turns out `UM#synchronize` was much slower than core Ruby's `Mutex#synchronize`. This was because the UM version was always performing a futex wake before returning, even if no fiber was waiting to lock the mutex. I rectified this by adding a `num_waiters` field to `struct um_mutex`, which indicates the number of fibers currently waiting to lock the mutex, and avoiding the call to `um_futex_wake` if it's 0.
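  For reference, a minimal sketch of the two paths being compared (the actual benchmark lives in `examples/bm_mutex.rb`; the block-taking `UM#synchronize` signature is shown as I understand it):

  ```ruby
  require 'uringmachine'

  machine = UM.new
  um_mutex = UM::Mutex.new
  ruby_mutex = Mutex.new

  # Stock Ruby mutex: synchronization is handled by the VM.
  1_000_000.times { ruby_mutex.synchronize { } }

  # UringMachine mutex: after the fix, the unlock path skips um_futex_wake
  # entirely when num_waiters == 0, saving a futex syscall per iteration.
  1_000_000.times { machine.synchronize(um_mutex) { } }
  ```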
- I also noticed that the `UM::Mutex` and `UM::Queue` classes were marked as `RUBY_TYPED_EMBEDDABLE`, which means the underlying `struct um_mutex` and `struct um_queue` were subject to moving. Obviously, you cannot just move a futex var while the kernel is potentially waiting on it to change. I fixed this by removing the `RUBY_TYPED_EMBEDDABLE` flag. This is a possible explanation for the occasional segfaults I've been seeing in Syntropy when doing lots of cancelled `UM#shift` ops (watching for file changes). (commit 3b013407ff94f8849517b0fca19839d37e046915)

- Added support for `IO::Buffer` in all low-level I/O APIs, which also means the fiber scheduler doesn't need to convert from `IO::Buffer` to strings in order to invoke the UringMachine API. (commits 620680d9f80b6b46cb6037a6833d9cde5a861bcd, 16d2008dd052e9d73df0495c16d11f52bee4fd15, 4b2634d018fdbc52d63eafe6b0a102c0e409ebca, bc9939f25509c0432a3409efd67ff73f0b316c61, a9f38d9320baac3eeaf2fcb2143294ab8d115fe9)

- Added a custom `UM::Error` exception class, raised on bad arguments or other API misuse. I've also added a `UM::Stream::RESPError` exception class to be instantiated on RESP errors. (commit 72a597d9f47d36b42977efa0f6ceb2e73a072bdf)

- I explored the fiber scheduler behaviour after forking. A fork done from a thread where a scheduler was set will result in a main thread with the same scheduler instance. For the scheduler to work correctly after a fork, its state must be reset. This is because sharing the same io_uring instance between parent and child processes is not possible (https://github.com/axboe/liburing/issues/612), and also because the child process keeps only the fiber from which the fork was made as its main fiber (the other fibers are lost).

  So, the right thing to do here would be to add a `Fiber::Scheduler` hook that will be invoked automatically by Ruby after a fork, and together with Samuel I'll see if I can prepare a PR for that to be merged for the Ruby 4.0 release.

  For the time being, I've added a `#post_fork` method to the UM fiber scheduler which should be manually called after a fork. (commit 2c7877385869c6acbdd8354e2b2909cff448651b)
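  A minimal sketch of the manual workflow, assuming the scheduler is reachable via `Fiber.scheduler` in the child (see also `examples/fiber_scheduler_fork.rb`):

  ```ruby
  pid = fork do
    # The child inherits the scheduler object, but its io_uring instance
    # and its fibers (other than the forking one) are unusable, so reset
    # the scheduler state before doing any scheduled I/O.
    Fiber.scheduler&.post_fork
    # ... child work ...
  end
  Process.wait(pid)
  ```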
- Added two new low-level APIs for waiting on processes (instead of `UM#waitpid`), using the io_uring version of `waitid`. The vanilla version, `UM#waitid`, returns an array containing the terminated process pid, exit status and code. The `UM#waitid_status` method returns a `Process::Status` with the pid and exit status. This method is present only if the `rb_process_status_new` function is available (see above).

- Implemented the `FiberScheduler#process_wait` hook using `#waitid_status`.

- For the sake of completeness, I also added `UM.pidfd_open` and `UM.pidfd_send_signal` for working with pidfds. A simple example:

  ```ruby
  child_pid = fork { ... }
  fd = UM.pidfd_open(child_pid)
  ...
  UM.pidfd_send_signal(fd, UM::SIGUSR1)
  ...
  pid2, status = machine.waitid(P_PIDFD, fd, UM::WEXITED)
  ```
# 2025-11-28

- On Samuel's suggestion, I've submitted a [PR](https://github.com/ruby/ruby/pull/15342) for adding a `Fiber::Scheduler#process_fork` hook that is automatically invoked after a fork. This is a continuation of the `#post_fork` method mentioned above. I still have a lot to learn about working with the Ruby core code, but I'm really excited about the possibility of this PR (and the [previous one](https://github.com/ruby/ruby/pull/15213) as well) getting merged in time for the Ruby 4.0 release.

- Added a bunch of tests for `UM::FiberScheduler`: socket I/O, file I/O, mutex, queue, waiting for threads. In the process I discovered a lot of things that can be improved in the way Ruby invokes the fiber scheduler:

  - For regular files, Ruby assumes file I/O can never be non-blocking (or async), and thus invokes the `#blocking_operation_wait` hook in order to perform the I/O on a separate thread. With io_uring, of course, file I/O *is* asynchronous.
  - For sockets there are no specialized hooks, like `#socket_send` etc. Instead, Ruby makes the socket fds non-blocking and invokes `#io_wait` to wait for the socket to be ready.

  I find it interesting how io_uring breaks a lot of assumptions about how I/O should be done.

# 2025-12-03

- Samuel and I continued discussing the behavior of the fiber scheduler after a fork. After talking it through, we decided the best course of action would be to remove the fiber scheduler after a fork, rather than to introduce a `process_fork` hook. This is the safer choice, since a scheduler risks carrying over some of its state across a fork, leading to unexpected behavior.

  Another problem I uncovered is that if a fork is done from a non-blocking fiber, the main fiber of the forked process (which "inherits" the forking fiber) stays in non-blocking mode, which may also lead to unexpected behavior, since the main fiber of every Ruby thread should be in blocking mode.

  So I submitted a new [PR](https://github.com/ruby/ruby/pull/15385) that corrects these two problems.

- I mapped the remaining missing hooks in the UringMachine fiber scheduler implementation, and made the tests more robust by checking that the different scheduler hooks were actually being called.

- Continued implementing the missing fiber scheduler hooks: `#fiber_interrupt`, `#address_resolve`, `#timeout_after`. For the most part, they were simple to implement. I probably spent most of my time figuring out how to test them, rather than implementing them. Most of the hooks involve just a few lines of code, with many of them consisting of a single line calling into the relevant UringMachine low-level API.

- Implemented the `#io_select` hook, which involved implementing a low-level `UM#select` method. This method took some effort to implement, since it needs to handle an arbitrary number of file descriptors to check for readiness. We need to create a separate SQE for each fd we want to poll, and when one or more CQEs arrive for polled fds, we also need to cancel all poll operations that have not yet completed.

  Since in many cases `IO.select` is called with just a single IO, I also added a special-case implementation of `UM#select` that specifically handles a single fd.
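  From the application's point of view nothing changes; with a scheduler set, plain `IO.select` is routed to the hook. A rough sketch, assuming `UM::FiberScheduler` can be constructed with no arguments and is loaded by requiring `uringmachine/fiber_scheduler`:

  ```ruby
  require 'uringmachine'
  require 'uringmachine/fiber_scheduler'

  Fiber.set_scheduler(UM::FiberScheduler.new)

  Fiber.schedule do
    r, w = IO.pipe
    w.write('x')
    # IO.select dispatches to the scheduler's #io_select hook, which uses
    # the low-level UM#select: one poll SQE per fd, with any outstanding
    # polls cancelled once the first CQE arrives.
    readable, _writable, _errored = IO.select([r], nil, nil, 1)
    p readable # => [r]
  end
  ```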
# 2025-12-04

- Implemented a worker pool for performing blocking operations in the scheduler. Up until now, each scheduler started its own worker thread for performing blocking operations, for use in the `#blocking_operation_wait` hook. The new implementation uses a worker thread pool shared by all schedulers, with the worker count limited to the CPU count. Workers are started when needed.

  I also added an optional `entries` argument for setting the SQE and CQE buffer sizes when starting a new `UringMachine` instance. The default size is 4096 SQE entries (liburing by default makes the CQE buffer size double that of the SQE buffer). The blocking-operation worker threads specify a value of 4, since they only use their UringMachine instance for popping jobs off the job queue and pushing the blocking operation result back to the scheduler.
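  For illustration (the `entries` argument is positional, per the task list below):

  ```ruby
  require 'uringmachine'

  # Default ring size: 4096 SQEs (CQ ring sized at double that by liburing).
  machine = UM.new

  # A worker thread's ring only shuttles jobs and results back and forth,
  # so a tiny ring is enough:
  worker_machine = UM.new(4)
  ```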
- Added support for `file_offset` argument in `UM#read` and `UM#write` in preparation for implementing the `#io_pread` and `#io_pwrite` hooks. The `UM#write_async` API, which permits writing to a file descriptor without waiting for the operation to complete, got support for specifying `length` and `file_offset` arguments as well. In addition, `UM#write` and `UM#write_async` got short-circuit logic for writes with a length of 0.
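  A sketch of positioned I/O through the low-level API; the argument order shown here and the file path are assumptions for illustration only:

  ```ruby
  require 'uringmachine'

  machine = UM.new
  buffer = IO::Buffer.new(4096)
  file = File.open('/tmp/data.bin', 'r+') # hypothetical file
  fd = file.fileno

  # Read 4096 bytes starting at file offset 0, then write them back at
  # file offset 8192, without moving the file position in between.
  machine.read(fd, buffer, 4096, 0)
  machine.write(fd, buffer, 4096, 8192)
  ```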
- Added support for specifying a buffer offset in the `#io_read` and `#io_write` hooks.

- Added support for timeouts in the `#block`, `#io_read` and `#io_write` hooks.

# 2025-12-05

- I found and fixed a problem with how `futex_wake` was done in the low-level UringMachine code handling mutexes and queues. This fixed a deadlock in the scheduler background worker pool where clients of the pool were not properly woken after the submitted operation was done.

- I finished work on the `#io_pread` and `#io_pwrite` hooks. Unfortunately, the test for `#io_pwrite` consistently hangs (not on `IO#pwrite` itself, but rather on closing the file). With Samuel's help, hopefully we'll find a solution...

- With those two last hooks, the fiber scheduler implementation is now feature complete! While I have written test cases for the different fiber scheduler hooks, I'd like to add more tests - especially tests that exercise multiple hooks, tests with high concurrency, and integration tests where I check how the fiber scheduler plays with Ruby APIs like `Net::HTTP` and the `socket` API in general.

# 2025-12-06

- Samuel has found the issue with pwrite (it turns out the `#io_pwrite` hook was being invoked with the GVL released), and [fixed it](https://github.com/ruby/ruby/pull/15428). So now `#pwrite` works correctly with a fiber scheduler!

- I followed Samuel's suggestion and incorporated some debug logging into the extension code interfacing with liburing, in order to facilitate debugging when issues are encountered.

- Added support for [SQPOLL mode](https://unixism.net/loti/tutorial/sq_poll.html) when setting up a UringMachine instance. It's not clear to me what the performance implications of that are, but I'll try to make some time to check it against [TP2](https://github.com/noteflakes/tp2), a UringMachine-based web server I'm currently using in a bunch of projects.
data/grant-2025/tasks.md
ADDED

@@ -0,0 +1,135 @@

- [v] io-event
  - [v] Make PR to use io_uring_prep_waitid for kernel version >= 6.7

- [ ] UringMachine low-level API
  - [v] Add support for IO::Buffer in UM API.
  - [v] Add `UM::Error` class to be used instead of RuntimeError
  - [v] Add optional ring size argument to `UM.new` (for example, the worker thread for the scheduler `blocking_operation_wait` hook does not need a lot of depth, so you can basically do `UM.new(4)`)
  - [v] Add debugging code suggested by Samuel
  - [v] Add support for SQPOLL
    https://unixism.net/loti/tutorial/sq_poll.html
  - [ ] Add support for using IO::Buffer in association with io_uring registered buffers / buffer rings
  - [ ] Set `IOSQE_CQE_SKIP_SUCCESS` flag for `#close_async` and `#write_async`
  - [ ] In `UM#spin` always start fibers as non-blocking.
  - [ ] Add some way to measure fiber CPU time.
    https://github.com/socketry/async/issues/428

- [ ] UringMachine Fiber::Scheduler implementation
  - [v] Check how scheduler interacts with `fork`.
  - [v] Implement `process_wait` (with `rb_process_status_new`)
  - [v] Implement `fiber_interrupt` hook
  - [v] Add `#address_resolve` hook with same impl as Async:
    https://github.com/socketry/async/blob/ea8b0725042b63667ea781d4d011786ca3658256/lib/async/scheduler.rb#L285-L296
  - [v] Implement other hooks:
    - [v] `#timeout_after`
      https://github.com/socketry/async/blob/ea8b0725042b63667ea781d4d011786ca3658256/lib/async/scheduler.rb#L631-L644
    - [v] `#io_pread`
    - [v] `#io_pwrite`
    - [v] `#io_select`
  - [v] Add timeout handling in different I/O hooks
  - [v] Experiment more with fork:
    - [v] what happens to schedulers on other threads (those that don't make it post-fork):
      - do they get GC'd?
      - do they get closed (`#scheduler_close` called)?
      - are they freed cleanly (at least for UM)?

      ```ruby
      # Stand-in scheduler that claims to respond to every hook:
      class S
        def respond_to?(sym) = true
      end

      o = S.new
      ObjectSpace.define_finalizer(o, ->(*) { puts 'scheduler finalized' })
      t1 = Thread.new { Fiber.set_scheduler(o); sleep }
      t2 = Thread.new {
        fork { p(t1:, t2:) }
        GC.start
      }

      # output:
      # scheduler finalized
      ```

      So, apparently there's no problem!

  - [v] Implement multi-thread worker pool for `blocking_operation_wait`: a single thread pool at class level, shared by all schedulers, with worker count according to CPU count
  - [v] Test working with non-blocking files; it should be fine, and we shouldn't need to reset `O_NONBLOCK`.
  - [v] Implement timeouts (how do timeouts interact with blocking ops?)
  - [ ] Implement `#yield` hook (https://github.com/ruby/ruby/pull/14700)
  - [ ] Finish documentation for the `FiberScheduler` class.

- [v] tests:
  - [v] Wrap the scheduler interface such that we can verify that specific hooks were called. Add asserts for called hooks in all tests.
  - [v] Sockets (only io_wait)
  - [v] Files
  - [v] Mutex / Queue
  - [v] Thread.join
  - [v] Process.wait
  - [v] fork
  - [v] system / exec / etc.
  - [v] popen
  - [ ] "Integration tests":
    - [ ] queue: multiple concurrent readers / writers
    - [ ] net/http test: ad-hoc HTTP/1.1 server + `Net::HTTP` client
    - [ ] sockets: echo server + many clients
  - [ ] IO - all methods!

- [ ] Benchmarks
  - [ ] UM queue / Ruby queue (threads) / Ruby queue with UM fiber scheduler
  - [ ] UM mutex / Ruby mutex (threads) / Ruby mutex with UM fiber scheduler
  - [ ] Pipe IO: raw UM / Ruby threaded / Ruby with UM fiber scheduler
  - [ ] Socket IO (with socketpair): raw UM / Ruby threaded / Ruby with UM fiber scheduler
  - [ ] Measure CPU (thread) time usage for the above examples:
    - run each version 1M times
    - measure total real time, total CPU time

      ```ruby
      real_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      cpu_time = Process.clock_gettime(Process::CLOCK_THREAD_CPUTIME_ID)
      ```

    - my hunch is we'll be able to show that with io_uring real_time is less, while cpu_time is more. But it's just a hunch.
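    A sketch of the measurement harness implied by the two clocks above (`push_and_shift` is a hypothetical stand-in for one benchmarked variant):

    ```ruby
    # Run one benchmark variant `iterations` times, returning total real
    # time and total (thread) CPU time.
    def measure(iterations)
      t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      c0 = Process.clock_gettime(Process::CLOCK_THREAD_CPUTIME_ID)
      iterations.times { yield }
      [Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0,
       Process.clock_gettime(Process::CLOCK_THREAD_CPUTIME_ID) - c0]
    end

    real, cpu = measure(1_000_000) { push_and_shift }
    puts format('real: %.3fs, cpu: %.3fs', real, cpu)
    ```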
- [ ] Ruby Fiber::Scheduler interface
  - [ ] Make a PR for resetting the scheduler and resetting the fiber non-blocking flag after fork.
  - [ ] Missing hook for close
  - [ ] Missing hooks for send/recv/sendmsg/recvmsg
  - [ ] Writes to a file (including `IO.write`) do not invoke `#io_write` (because writes to files cannot be non-blocking?) Instead, `#blocking_operation_wait` is invoked.

- [ ] SSL
  - [ ] openssl gem: custom BIO?
    - curl: https://github.com/curl/curl/blob/5f4cd4c689c822ce957bb415076f0c78e5f474b5/lib/vtls/openssl.c#L786-L803

- [ ] UringMachine website
  - [ ] domain: uringmachine.dev
  - [ ] logo: ???
  - [ ] docs (similar to papercraft docs)

- [ ] Uma - web server
  - [ ] child process workers
    - [ ] reforking (following https://github.com/Shopify/pitchfork);
      see also: https://byroot.github.io/ruby/performance/2025/03/04/the-pitchfork-story.html
      - Monitor worker memory usage - how much is shared
      - Choose the worker with the highest served-request count as the "mold" for the next generation
      - Perform GC out of band, preferably when there are no active requests:
        https://railsatscale.com/2024-10-23-next-generation-oob-gc/
      - When a worker is promoted to "mold", it:
        - Stops `accept`ing requests
        - When finally idle, calls `Process.warmup`
        - Starts replacing sibling workers with forked workers;
          see also: https://www.youtube.com/watch?v=kAW5O2dkSU8
  - [ ] Each worker is single-threaded (except for auxiliary threads)
  - [ ] Rack 3.0-compatible;
    see: https://github.com/socketry/protocol-rack
  - [ ] Rails integration (Railtie);
    see: https://github.com/socketry/falcon
  - [ ] Benchmarks
    - [ ] Add to the TechEmpower benchmarks