[ruby-core:113489] [Ruby master Bug#19642] Remove vectored read/write from `io.c`.

Issue #19642 has been reported by ioquatix (Samuel Williams). ---------------------------------------- Bug #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/

Issue #19642 has been updated by ioquatix (Samuel Williams). Assignee set to ioquatix (Samuel Williams) ---------------------------------------- Feature #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642#change-103076 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/

Issue #19642 has been updated by ioquatix (Samuel Williams). https://github.com/ruby/ruby/pull/7825 ---------------------------------------- Feature #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642#change-103142 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/

Issue #19642 has been updated by ioquatix (Samuel Williams). I would like to do a more comprehensive review of performance, but it seems minimal, even in the best possible circumstance: ``` | |compare-ruby|built-ruby| |:---------|-----------:|---------:| |io_write | 2.098| 2.089| | | 1.00x| -| ``` ---------------------------------------- Feature #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642#change-103143 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/

Issue #19642 has been updated by tenderlovemaking (Aaron Patterson). I understand the concern of copying the `iovec`, but it seems like the overhead of making N calls to `write` would at some point be more expensive than copying the struct once. I guess under the hood `writev` may just be calling `write`, in which case copying the struct doesn't make sense. This patch simplifies the code, so if there's no significant perf difference then it seems fine to me. 😃 ---------------------------------------- Feature #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642#change-103145 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/

Issue #19642 has been updated by ioquatix (Samuel Williams). The concern is less about the internal overhead. I'm sure different OS can optimise for different situations, e.g. in some cases I imagine `writev` can be faster than `write` if the system call overhead is large. The advice we received for Linux from Jens Axboe is that `write` is always more efficient that `writev`. In our case, our vectored write "public interface" doesn't change, i.e. `io.write(a, b, c)` still works and we could at any point opt to use `writev` again. The difference in performance is at best 5% in favour of `writev` but only in extreme cases and I suspect we can narrow this significantly, i.e. a small change to `rb_io_puts` changed the overhead from 5% difference to 2-3% which is barely measurable in real world programs. From an implementation point of view, the biggest convenience is: - Removal of 300 lines of optional code for handling `writev`. - The fiber scheduler will probably never support `writev` as it significantly increases the complexity of every piece of the interface with near zero performance improvement. - With the removal of `writev`, we could focus more on buffering internally to avoid multiple calls to write which is better than `writev`. I do care about the performance, if there is a big difference, we shouldn't consider this PR, but it also greatly simplifies the code, which I like. ---------------------------------------- Feature #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642#change-103149 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/

Issue #19642 has been updated by usa (Usaku NAKAMURA). If I remember correctly, writev was introduced for atomic writes, not for performance. (I am neutral to remove writev.) ---------------------------------------- Feature #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642#change-103150 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/

Issue #19642 has been updated by ioquatix (Samuel Williams). Thanks Nakamura-san for your feedback. According to POSIX:
Atomic/non-atomic: A write is atomic if the whole amount written in one operation is not interleaved with data from any other process. This is useful when there are multiple writers sending data to a single reader. Applications need to know how large a write request can be expected to be performed atomically. This maximum is called {PIPE_BUF}. This volume of IEEE Std 1003.1-2001 does not say whether write requests for more than {PIPE_BUF} bytes are atomic, but requires that writes of {PIPE_BUF} or fewer bytes shall be atomic.
I assume since `writev` is defined in terms of `write`, that `writev` of total size <= `PIPE_BUF` will be atomic. It is not defined what happens when the size is bigger than `PIPE_BUF`, but we can assume it is non-atomic. Since we already have a write lock in `IO`, writes should be atomic at the Ruby IO level. But not at the kernel level and not between processes that share the same pipes. ``` irb(main):001:0> $stderr.sync => true irb(main):002:0> $stdout.sync => true ``` It looks like `$stdout` and `$stderr` are both buffered internally. I think Ruby should guarantee buffered writes will be atomic up to PIPE_BUF. Unbuffered writes should have no atomicity guarantees. If it's not documented, maybe we can consider adding that. ---------------------------------------- Feature #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642#change-103151 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Assignee: ioquatix (Samuel Williams) ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/

"ioquatix (Samuel Williams) via ruby-core" <ruby-core@ml.ruby-lang.org> wrote:
``` irb(main):001:0> $stderr.sync => true irb(main):002:0> $stdout.sync => true ```
It looks like `$stdout` and `$stderr` are both buffered internally.
IO#sync==true means unbuffered.
I think Ruby should guarantee buffered writes will be atomic up to PIPE_BUF.
PIPE_BUF is only relevant for pipes/FIFOs (not regular files, sockets, etc..).
Unbuffered writes should have no atomicity guarantees.
Each write/writev syscall to regular files is atomic with O_APPEND. It's how multi-process servers can write to the same log file without interleaving lines. The limit is not PIPE_BUF, but (IIRC) roughly INT_MAX/SSIZE_MAX; It may be FS-dependent, but it's large enough to not matter on reasonable local FSes. Side notes: Also, any writev benchmarks you'd do should account for the common pattern of giant bodies prefixed with a small header. (e.g. HTTP/1.1 chunking, tnetstrings(*), etc). The 128 bytes you use in the benchmark is tiny and unrealistic. writev has high overhead with many small chunks (each iovec is 16 bytes), and most I/O size is far larger than 128 bytes. (*) https://tnetstrings.info/ I understand that writev is slow for the kernel, but (alloc + memcpy + GC/free) w/ giant strings is slow in userspace, too. Stuff like: io.write("#{buf.bytesize.to_s(16)}\r\n", buf, -"\r\n") When buf is a gigantic string (multiple KB or MB) is where I expect writev to avoid copies and garbage. Adding a buffer_offset (tentative name) arg to write_nonblock would make retrying non-blocking I/O much easier: n = 0 tot = ary.map(&:bytesize).sum while n < tot case w = io.write_nonblock(*ary, buffer_offset: n, exception: false) when :wait_readable, :wait_writable break # or schedule or whatever else n += w end end Perl's syswrite/read/sysread ops have the OFFSET arg for decades.

Issue #19642 has been updated by ioquatix (Samuel Williams). Status changed from Assigned to Closed I am no longer planning to do this. ---------------------------------------- Feature #19642: Remove vectored read/write from `io.c`. https://bugs.ruby-lang.org/issues/19642#change-110710 * Author: ioquatix (Samuel Williams) * Status: Closed * Assignee: ioquatix (Samuel Williams) ---------------------------------------- https://github.com/socketry/async/issues/228#issuecomment-1546789910 is a comment from the kernel developer who tells us that `writev` is always worse than `write` system call. A large amount of complexity in `io.c` comes from optional support from `writev`. So, I'd like to remove support for `writev`. I may try to measure the performance before/after. However it may not show much difference, except that the implementation in `io.c` can be much simpler. -- https://bugs.ruby-lang.org/
participants (5)
-
Eric Wong
-
ioquatix (Samuel Williams)
-
ioquatix (Samuel Williams)
-
tenderlovemaking (Aaron Patterson)
-
usa (Usaku NAKAMURA)