
Issue #20169 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).

Thought: we could probably remove the need for the read barrier if we swept the heap first and _then_ compacted, so that no objects are freed between moving things and updating references.

It would mean scanning the first half of each heap twice (once to sweep it, and once again to find the holes to fill with compacted objects). But maybe `GC.compact` is not that performance sensitive, since it's mostly used before forking multiprocess web servers?

It would hurt performance for `GC.auto_compact`, but I don't really know if there's much to be done about it. I can't find any literature on conservative, incremental, moving GCs, which is what you'd want for `auto_compact`. Immix is conservative and moving, but it moves objects whilst stopping the world. Incremental or concurrent moving GCs like Shenandoah work by inserting compiler-provided read barriers around all accesses (which we can't do to C extensions) to fix up moved objects as they're loaded from the heap, and by using compiler-generated stackmaps (which we can't have for C extensions) to directly update existing moved references on the stack.

CC @eightbitraptor since I'm pretty sure I've heard you talk about Immix before, maybe you have thoughts?

----------------------------------------
Bug #20169: `GC.compact` can raises `EFAULT` on IO
https://bugs.ruby-lang.org/issues/20169#change-106244

* Author: ko1 (Koichi Sasada)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
1. `GC.compact` introduces read barriers to detect read accesses to the pages.
2. I/O operations release the GVL while they execute, so another thread can run and call `GC.compact` (or trigger the auto-compact feature, I guess, but I have not checked that yet).
3. A `write(ptr)` call can fail with `EFAULT` while `GC.compact` is running, because `ptr` can point into read-barrier-protected pages (embedded strings).
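
The failure mode in steps 1 to 3 can be reproduced outside of Ruby. The following is only a minimal sketch under stated assumptions, not Ruby's actual GC code: it assumes a POSIX system where `MAP_ANONYMOUS` is available, uses `mprotect(PROT_NONE)` as a stand-in for the read barrier, and uses a pipe as a stand-in for the I/O target. The function name `errno_for_write_from_protected_page` is hypothetical.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, not part of Ruby. Returns the errno that write(2)
 * reports when its buffer lives in a page with no access permissions,
 * 0 if the write unexpectedly succeeds, or -1 if the setup itself fails. */
int
errno_for_write_from_protected_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);

    int fds[2];
    if (pipe(fds) != 0) return -1;

    /* Stand-in for a GC heap page that holds an embedded string. */
    char *page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    strcpy(page, "<1>");

    int observed = -1;
    /* Stand-in for the read barrier: revoke all access to the page. */
    if (mprotect(page, (size_t)page_size, PROT_NONE) == 0) {
        /* The kernel cannot copy from the protected page. A system call
         * reports that as an error return, not as a fault in user space. */
        ssize_t n = write(fds[1], page, 3);
        observed = (n < 0) ? errno : 0;
    }

    munmap(page, (size_t)page_size);
    close(fds[0]);
    close(fds[1]);
    return observed;
}
```

On Linux I would expect this to return `EFAULT`, which is the error behind the `Bad address` message in the log. The point of the sketch is that a protection-based barrier only sees accesses made from user space; a buffer handed to the kernel while the GVL is released bypasses it and surfaces as a failed system call.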
Reproducible steps: Apply the following patch to increase possibility: ```patch diff --git a/io.c b/io.c index f6cd2c1a56..83d67ba2dc 100644 --- a/io.c +++ b/io.c @@ -1212,8 +1212,12 @@ internal_write_func(void *ptr) } } + int cnt = 0; retry: - do_write_retry(write(iis->fd, iis->buf, iis->capa)); + for (; cnt < 1000; cnt++) { + do_write_retry(write(iis->fd, iis->buf, iis->capa)); + if (result <= 0) break; + } if (result < 0 && !iis->nonblock) { int e = errno; ``` Run the following code: ```ruby t1 = Thread.new{ 10_000.times.map{"#{_1}"}; GC.compact while true } t2 = Thread.new{ i=0 $stdout.write "<#{i+=1}>" while true } t2.join ``` and ``` $ make run (snip) 4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4><4>#<Thread:0x00007fa61b4dd758 ../../src/trunk/test.rb:3 run> terminated with exception (report_on_exception is true): ../../src/trunk/test.rb:5:in `write': Bad address @ io_write - <STDOUT> (Errno::EFAULT) from ../../src/trunk/test.rb:5:in `block in <main>' ../../src/trunk/test.rb:5:in `write': Bad address @ io_write - <STDOUT> (Errno::EFAULT) from ../../src/trunk/test.rb:5:in `block in <main>' make: *** [uncommon.mk:1383: run] Error 1 ``` I think this is why we get `EFAULT` on CI. To increase possibilities running many busy processes (`ruby -e 'loop{}'` for example) will help (and on CI environment there are such busy processes accidentally). -- https://bugs.ruby-lang.org/