[ruby-core:117300] [Ruby master Feature#19057] Hide implementation of `rb_io_t`.

Issue #19057 has been updated by ioquatix (Samuel Williams).
Why don't you reconsider the "nested public interface" approach?
My assessment of this approach is that it would require a rewrite of all internal code that accesses `rb_io_t` internals. Rewriting code isn't a problem, but it takes a lot of time and effort that I don't have in abundance right now. Additionally, it's still only a partial solution. One of the issues is accessing the file descriptor directly and the handling of `IO#close` from a different thread/fiber. Using the file descriptor directly can be problematic.
Ruby lost numerous users due to a never-ending stream of incompatibilities introduced every year.
Compatibility and progress are always something to balance out. I think on the whole, Ruby does a pretty decent job at it. Lack of forward progress also causes users to look elsewhere as the language stagnates. So this isn't just. a matter of "preserve compatibility at all costs", as those costs will be the same. The hard part is finding the middle ground of compatibility and progress. Both compatibility at any cost, and progress at any cost, are naive statements. ---------------------------------------- Feature #19057: Hide implementation of `rb_io_t`. https://bugs.ruby-lang.org/issues/19057#change-107442 * Author: ioquatix (Samuel Williams) * Status: Assigned * Assignee: ioquatix (Samuel Williams) * Target version: 3.4 ---------------------------------------- In order to make improvements to the IO implementation like <https://bugs.ruby-lang.org/issues/18455>, we need to add new fields to `struct rb_io_t`. By the way, ending types in `_t` is not recommended by POSIX, so I'm also trying to rename the internal implementation to drop `_t` where possible during this conversion. Anyway, we should try to hide the implementation of `struct rb_io`. Ideally, we don't expose any of it, but the problem is backwards compatibility. So, in order to remain backwards compatibility, we should expose some fields of `struct rb_io`, the most commonly used one is `fd` and `mode`, but several others are commonly used. There are many fields which should not be exposed because they are implementation details. ## Current proposal The current proposed change <https://github.com/ruby/ruby/pull/6511> creates two structs: ```c // include/ruby/io.h #ifndef RB_IO_T struct rb_io { int fd; // ... public fields ... }; #else struct rb_io; #endif // internal/io.h #define RB_IO_T struct rb_io { int fd; // ... public fields ... // ... private fields ... }; ``` However, we are not 100% confident this is safe according to the C specification. My experience is not sufficiently wide to say this is safe in practice, but it does look okay to both myself, and @Eregon + @tenderlovemaking have both given some kind of approval. That being said, maybe it's not safe. There are two alternatives: ## Hide all details We can make public `struct rb_io` completely invisible. ```c // include/ruby/io.h #define RB_IO_HIDDEN struct rb_io; int rb_ioptr_descriptor(struct rb_io *ioptr); // accessor for previously visible state. // internal/io.h struct rb_io { // ... all fields ... }; ``` This would only be forwards compatible, and code would need to feature detect like this: ```c #ifdef RB_IO_HIDDEN #define RB_IOPTR_DESCRIPTOR rb_ioptr_descriptor #else #define RB_IOPTR_DESCRIPTOR(ioptr) rb_ioptr_descriptor(ioptr) #endif ``` ## Nested public interface Alternatively, we can nest the public fields into the private struct: ```c // include/ruby/io.h struct rb_io_public { int fd; // ... public fields ... }; // internal/io.h #define RB_IO_T struct rb_io { struct rb_io_public public; // ... private fields ... }; ``` ## Considerations I personally think the "Hide all details" implementation is the best, but it's also the lest compatible. This is also what we are ultimately aiming for, whether we decide to take an intermediate "compatibility step" is up to us. I think "Nested public interface" is messy and introduces more complexity, but it might be slightly better defined than the "Current proposal" which might create undefined behaviour. That being said, all the tests are passing. -- https://bugs.ruby-lang.org/

"ioquatix (Samuel Williams) via ruby-core" <ruby-core@ml.ruby-lang.org> wrote:
Issue #19057 has been updated by ioquatix (Samuel Williams).
Why don't you reconsider the "nested public interface" approach?
My assessment of this approach is that it would require a rewrite of all internal code that accesses `rb_io_t` internals. Rewriting code isn't a problem, but it takes a lot of time and effort that I don't have in abundance right now. Additionally, it's still only a partial solution. One of the issues is accessing the file descriptor directly and the handling of `IO#close` from a different thread/fiber. Using the file descriptor directly can be problematic.
Again, why not the container_of solution I proposed? It shouldn't be that much for you since the relevant core internals could be divorced from extensions and significantly less work for stressed out and overworked downstream maintainers. Cross-thread IO#close or close(2) is a PITA to deal with portably if another thread is using that FD in any way. This applies to pure C projects, too, not just Ruby ones. Either: 1) ensure only one thread operates on a given FD and closes it; this is typical for stream FDs. 2) get all threads to abandon the FD before closing it. I do this for thread-shared packetized FDs which are analogous to Queue (O_DIRECT pipe, SOCK_SEQPACKET, listen sockets, kqueue/epoll FDs, etc...). Atomics + out-of-band (pthread_kill) signaling work for both cases. I've also used sentinel values in-band for packetized FDs (analogous to pushing `:stop' messages into a thread-shared Queue, like forcing a connect() in one thread to get blocking accept() to wake up; or adding an (eventfd||pipe) into a (kqueue||epoll) from another thread (it's 100% safe to EPOLL_CTL_ADD||EV_ADD from different threads and I've been abusing this safety for well over a decade)
Ruby lost numerous users due to a never-ending stream of incompatibilities introduced every year.
Compatibility and progress are always something to balance out. I think on the whole, Ruby does a pretty decent job at it. Lack of forward progress also causes users to look elsewhere as the language stagnates. So this isn't just. a matter of "preserve compatibility at all costs", as those costs will be the same. The hard part is finding the middle ground of compatibility and progress. Both compatibility at any cost, and progress at any cost, are naive statements.
git and Linux kernel do a pretty good job at maintaining compatibility while adding new features. OTOH some of the incompatibility proposals for Perl 5 have made me look into writing more C...
participants (2)
-
Eric Wong
-
ioquatix (Samuel Williams)