[ruby-core:124518] [Ruby Bug#21836] RUBY_MN_THREADS deadlock and sleep issues
Issue #21836 has been reported by jpl-coconut (Jacob Lacouture).

----------------------------------------
Bug #21836: RUBY_MN_THREADS deadlock and sleep issues
https://bugs.ruby-lang.org/issues/21836

* Author: jpl-coconut (Jacob Lacouture)
* Status: Open
* ruby -v: ruby 3.4.7 (2025-11-08) +PRISM [aarch64-linux]
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN
----------------------------------------
I created a benchmark for the purpose of testing [a fix](https://github.com/ruby/ruby/pull/15840) for [Issue 21685](https://bugs.ruby-lang.org/issues/21685).

The benchmark is inline below and saved as 100usec.rb. It should always take just a bit over 10 seconds, regardless of how many cores or threads are assigned. It should look like this:

```
time taskset --cpu-list 1 ./ruby 100usec.rb 1

real 0m10.130s
user 0m0.094s
sys 0m0.218s
```

It works fine with MN_THREADS disabled. However, with RUBY_MN_THREADS=1, I see two separate issues.
### Issue #1 is that when the number of cores is set to more than 1, the benchmark completes too fast.
These results are consistent across the ruby versions I tried. I did a little debugging and found that the call to `sleep(0.001)` (1ms) is returning after only 0.000005 seconds (5us).

```
time RUBY_MN_THREADS=1 taskset --cpu-list 1,2 ./ruby 100usec.rb 1

real 0m0.359s
user 0m0.075s
sys 0m0.134s
```

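One way to check the early return described above is to time individual `sleep` calls with a monotonic clock. The following is only a minimal sketch, not the debugging code used for this report: `measure_sleep` and `summarize` are hypothetical helper names, and the measurement runs inside `Thread.new` simply because that is where the benchmark sleeps.

```ruby
# Hypothetical helper (not from the report): time each sleep call
# with the monotonic clock and return the observed durations in seconds.
def measure_sleep(requested_sec, samples: 20)
  Array.new(samples) do
    t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    sleep requested_sec
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
  end
end

# Hypothetical helper: reduce the samples to min / median / max.
def summarize(durations)
  sorted = durations.sort
  { min: sorted.first, median: sorted[sorted.size / 2], max: sorted.last }
end

# Measure inside a thread created with Thread.new, like the benchmark does.
stats = summarize(Thread.new { measure_sleep(0.0001) }.value)
printf("min=%.6fs median=%.6fs max=%.6fs\n", stats[:min], stats[:median], stats[:max])
```

Running this once without and once with `RUBY_MN_THREADS=1` (and with the same `taskset` settings as above) should make it visible whether the observed durations fall below the requested value.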
### Issue #2 shows up when the number of cores is limited to 1.
- On ruby 3.4.7 it runs very slow, about 6x slower than expected.

```
time RUBY_MN_THREADS=1 taskset --cpu-list 1 ./ruby 100usec.rb 1

real 1m2.277s
user 0m0.249s
sys 0m0.408s
```

- On ruby 4.0.0-preview2 it usually deadlocks, but sometimes it segfaults.

```
time RUBY_MN_THREADS=1 taskset --cpu-list 1 ./ruby ruby/100usec.rb 1
[BUG] unreachable
ruby 4.0.0preview2 (2025-11-17 master 4fa6e9938c) +MN +PRISM [aarch64-linux]

-- Control frame information -----------------------------------------------

-- Threading information ---------------------------------------------------
Total ractor count: 1
Ruby thread count for this ractor: 1

-- C level backtrace information -------------------------------------------
[BUG] Illegal instruction at 0x0000ffff93777250
ruby 4.0.0preview2 (2025-11-17 master 4fa6e9938c) +MN +PRISM [aarch64-linux]

Crashed while printing bug report
Illegal instruction
```

or

```
time RUBY_MN_THREADS=1 taskset --cpu-list 1 ./ruby ruby/100usec.rb 1
<<DEADLOCK>>
```

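Because a deadlocked run never returns, it can help to wrap the reproduction in a watchdog that kills the child after a deadline and distinguishes a hang from a crash. The following is a rough sketch under stated assumptions: `run_with_deadline` is a hypothetical helper, and the 30 second deadline in the usage comment is an arbitrary choice, not a value from this report.

```ruby
require "rbconfig"

# Hypothetical helper (not from the report): run a command in a child
# process and report :ok, :failed (non-zero exit or crash) or :timeout.
def run_with_deadline(cmd, deadline_sec, env = {})
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  pid = Process.spawn(env, *cmd, out: File::NULL, err: File::NULL)
  loop do
    # With WNOHANG, wait2 returns nil while the child is still running.
    _, status = Process.wait2(pid, Process::WNOHANG)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    return [status.success? ? :ok : :failed, elapsed] if status
    if elapsed > deadline_sec
      # Treat an overrun as a hang: kill the child and reap it.
      Process.kill("KILL", pid)
      Process.wait(pid)
      return [:timeout, elapsed]
    end
    sleep 0.05
  end
end

# Example usage (assumes 100usec.rb is in the current directory):
#   result, secs = run_with_deadline([RbConfig.ruby, "100usec.rb", "1"], 30,
#                                    { "RUBY_MN_THREADS" => "1" })
#   puts "#{result} after #{secs.round(2)}s"
```

The watchdog runs in a separate process on purpose, so that it keeps working even if the thread scheduler in the process under test is stuck. CPU pinning with `taskset` would still need to be applied from the shell around the whole invocation.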
### The benchmark code (100usec.rb)
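``` ruby
ITRCOUNT = 10000

# One writer/reader pair: the writer sleeps 100 usec before each byte it sends.
def inner_test
  r, w = IO.pipe
  reader = Thread.new do
    ITRCOUNT.times.map {|i|
      r.getbyte
    }
  end
  ITRCOUNT.times.map {|i|
    sleep 0.0001
    w.write('0')
  }
  reader.join
end

# Run `count` writer/reader pairs concurrently and wait for all of them.
def outer_test(count)
  count.times.map{|j|
    Thread.new do
      inner_test
    end
  }.each{|t| t.join}
end

outer_test(ARGV[0].to_i)
```
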
Issue #21836 has been updated by luke-gru (Luke Gruber).

I'm confused about what should happen. Shouldn't it return roughly after 1 second instead of 10 seconds? I'll look into the sleep issue with `RUBY_MN_THREADS=1`, but I can't reproduce the deadlock or segfault with a more recent commit (ad6b85450d).

----------------------------------------
Bug #21836: RUBY_MN_THREADS deadlock and sleep issues
https://bugs.ruby-lang.org/issues/21836#change-116198

* Author: jpl-coconut (Jacob Lacouture)
* Status: Open
* ruby -v: ruby 3.4.7 (2025-11-08) +PRISM [aarch64-linux]
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN

Issue #21836 has been updated by jpl-coconut (Jacob Lacouture).

To close the loop here, yes, I was wrong: 1 sec should be expected, not 10. Without MN_THREADS, the 0.1msec sleep becomes a 1msec sleep. With MN_THREADS, the 0.1msec sleep is skipped completely. I see the second of these issues is now fixed. Thanks for the discussion and fix!

----------------------------------------
Bug #21836: RUBY_MN_THREADS deadlock and sleep issues
https://bugs.ruby-lang.org/issues/21836#change-116219

* Author: jpl-coconut (Jacob Lacouture)
* Status: Closed
* ruby -v: ruby 3.4.7 (2025-11-08) +PRISM [aarch64-linux]
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN

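The two runtimes discussed in this thread (about 1 second versus about 10 seconds) follow from multiplying the iteration count by the per-iteration sleep. As a small worked example, using the benchmark's `ITRCOUNT` and the 0.1 msec and 1 msec figures mentioned above (`expected_seconds` is a hypothetical helper, and the estimate ignores pipe and scheduling overhead):

```ruby
ITRCOUNT = 10_000

# Hypothetical helper: lower bound on runtime if every iteration sleeps
# for sleep_usec microseconds (integer microseconds keep the math exact).
def expected_seconds(iterations, sleep_usec)
  iterations * sleep_usec / 1_000_000.0
end

nominal   = expected_seconds(ITRCOUNT, 100)    # the requested sleep 0.0001
effective = expected_seconds(ITRCOUNT, 1_000)  # if each sleep really takes about 1 msec

puts "requested 0.1 msec sleeps: #{nominal} s"
puts "effective 1 msec sleeps:   #{effective} s"
```

Under these assumptions the requested sleeps add up to 1 second, while sleeps that are stretched to about 1 msec add up to 10 seconds, which matches the 10.130s run reported with MN_THREADS disabled.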
participants (2)
- jpl-coconut (Jacob Lacouture)
- luke-gru (Luke Gruber)