
Issue #20169 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).


Well I did a bit more thinking about this.

Firstly, I had a very unproductive morning trying to see if Mach exceptions on macOS could catch invalid accesses in system calls, the way that `userfaultfd` can on Linux. Short answer: no.

The second insight I had, though, is that if these objects are on the machine stack, then they actually should be pinned anyway. If you have a C extension and you take a pointer to some internal part of an object, it's already a requirement that you ensure the Ruby value gets spilled to the stack, i.e. you need to do something like this:

```
VALUE str = rb_sprintf("i am a cool string");
write(fd, RSTRING_PTR(str), RSTRING_LEN(str));
RB_GC_GUARD(str); // spill string to stack; NOT OPTIONAL!
```

Maybe we could change the GC compaction algorithm to not move _any_ objects on a page (and hence skip protecting the page) if any objects in the page are live on the machine stack? That _would_ substantially lessen the effectiveness of GC compaction I suppose, but we could maybe get that effectiveness back if `userfaultfd` is available?

----------------------------------------
Bug #20169: `GC.compact` can raises `EFAULT` on IO
https://bugs.ruby-lang.org/issues/20169#change-106203

* Author: ko1 (Koichi Sasada)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
1. `GC.compact` introduces read barriers to detect read accesses to the pages.
2. I/O operations release the GVL while they execute, so another thread can call `GC.compact` in the meantime (or trigger auto compaction, I guess, but I have not checked that yet).
3. Calling `write(ptr)` can return `EFAULT` while `GC.compact` is running, because `ptr` can point into read-barrier-protected pages (embedded strings).
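
Point 3 can be observed outside of Ruby. The following is a minimal C sketch, not code from the Ruby source tree: it assumes the read barrier behaves like `mprotect(..., PROT_NONE)` on the page holding the buffer, and the function name `errno_of_write_from_protected_page` is hypothetical. A pipe is used as the destination because the kernel has to copy the bytes out of the user buffer for it.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno seen when write(2) is asked to read
 * from a page that has been made inaccessible, 0 if that write unexpectedly
 * succeeded, or -1 if the setup itself failed. */
int errno_of_write_from_protected_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    int fds[2];
    if (pipe(fds) != 0) return -1;

    char *page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    memcpy(page, "hello", 5);

    int result = -1;
    /* While the page is readable, the write goes through as usual. */
    if (write(fds[1], page, 5) == 5 &&
        /* Assumption: this stands in for the compaction read barrier. */
        mprotect(page, (size_t)page_size, PROT_NONE) == 0) {
        /* The kernel cannot copy from the page any more, so the system call
         * fails with an error code instead of delivering a signal. */
        errno = 0;
        ssize_t n = write(fds[1], page, 5);
        result = (n < 0) ? errno : 0;
    }

    munmap(page, (size_t)page_size);
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

On Linux and macOS the helper is expected to return `EFAULT`. Because the fault happens inside the kernel, no `SIGSEGV` or `SIGBUS` is raised for a user-space handler to catch, which is why the error surfaces in Ruby as `Errno::EFAULT`.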
Reproducible steps: Apply the following patch to increase possibility: ```patch diff --git a/io.c b/io.c index f6cd2c1a56..83d67ba2dc 100644 --- a/io.c +++ b/io.c @@ -1212,8 +1212,12 @@ internal_write_func(void *ptr) } } + int cnt = 0; retry: - do_write_retry(write(iis->fd, iis->buf, iis->capa)); + for (; cnt < 1000; cnt++) { + do_write_retry(write(iis->fd, iis->buf, iis->capa)); + if (result <= 0) break; + } if (result < 0 && !iis->nonblock) { int e = errno; ``` Run the following code: ```ruby t1 = Thread.new{ 10_000.times.map{"#{_1}"}; GC.compact while true } t2 = Thread.new{ i=0 $stdout.write "<#{i+=1}>" while true } t2.join ``` and ``` $ make run (snip) 4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4>#<Thread:0x00007fa61b4dd758 ../../src/trunk/test.rb:3 run> terminated with exception (report_on_exception is true): ../../src/trunk/test.rb:5:in `write': Bad address @ io_write - <STDOUT> (Errno::EFAULT) from ../../src/trunk/test.rb:5:in `block in <main>' ../../src/trunk/test.rb:5:in `write': Bad address @ io_write - <STDOUT> (Errno::EFAULT) from ../../src/trunk/test.rb:5:in `block in <main>' make: *** [uncommon.mk:1383: run] Error 1 ``` I think this is why we get `EFAULT` on CI. To increase possibilities running many busy processes (`ruby -e 'loop{}'` for example) will help (and on CI environment there are such busy processes accidentally). -- https://bugs.ruby-lang.org/