[ruby-core:124563] [Ruby Bug#21840] Locking a mutex can lead to starvation
Issue #21840 has been reported by luke-gru (Luke Gruber).

----------------------------------------
Bug #21840: Locking a mutex can lead to starvation
https://bugs.ruby-lang.org/issues/21840

* Author: luke-gru (Luke Gruber)
* Status: Open
* Assignee: luke-gru (Luke Gruber)
* Target version: 4.0
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN
----------------------------------------
Continually locking a mutex `m` can lead to starvation if all other threads are on the waitq of `m`.

Let `T` be the thread that keeps on acquiring mutex `m` in a loop.

Iteration 1:

1) `T` locks mutex `m`
2) All other threads attempt to acquire `m` and end up on its waitq
3) `T` calls a blocking function and releases the GVL. It doesn't wake any other threads because they're all waiting on `m`
4) `T` returns from the blocking function, resets its `running_time` to 0, acquires the GVL and continues running
5) `T` unlocks `m`. It adds the head of the waitq (`T2`) to the thread readyq. `T` continues running

Iteration 2:

1) `T` locks mutex `m`
2) `T` calls a blocking function. This time, it dequeues the head of the readyq (e.g. `T2`) and sends it the wakeup signal. `T` runs its blocking function
3) `T2` wakes up, acquires the GVL and attempts to lock mutex `m`. It fails and goes back to sleep, putting itself back on the waitq of `m`
4) `T` returns from the blocking function, sets its `running_time` back to 0, acquires the GVL and keeps running
5) `T` unlocks `m`. It adds the head of the waitq (e.g. `T3`) to the end of the thread readyq. `T` continues running

Repeat Iteration 2, except the head of the readyq is now `T3`.

The problem is that `T` can never be pre-empted.

Example script:

```ruby
m = Mutex.new

def fib(n)
  return n if n <= 1
  fib(n - 1) + fib(n - 2)
end

t1_running = false
t1 = Thread.new do
  t1_running = true
  loop do
    fib(20)
    m.synchronize do
      $stderr.puts "t1 iter"
    end
  end
end

loop until t1_running

5.times.map do
  Thread.new do
    m.synchronize do
    end
  end
end.each(&:join)
```

--
https://bugs.ruby-lang.org/
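The two iterations above can be traced with a small toy model of the queues. This is only a sketch of the bookkeeping described in the report, not of the VM's scheduler: the method name `simulate`, its parameters, and the thread labels are all hypothetical, and the model simply assumes (as the report states) that `T` is never pre-empted.

```ruby
# Toy model of the waitq/readyq hand-off described in the report.
# All names are hypothetical; this does not mirror the VM's real data structures.
def simulate(iterations:, waiters: %w[T2 T3 T4])
  waitq    = waiters.dup   # threads parked on mutex m, FIFO order
  readyq   = []            # threads waiting for the GVL
  acquired = Hash.new(0)   # how many times each thread locked m
  log      = []

  iterations.times do
    acquired["T"] += 1               # 1) T locks m
    if (woken = readyq.shift)        # 2) blocking call: T wakes the readyq head
      # 3) the woken thread finds m still locked and parks on the waitq again
      waitq << woken
      log << "#{woken} woke, failed to lock m"
    end
    # 4) T returns from the blocking call and re-acquires the GVL
    # 5) T unlocks m and moves the waitq head to the tail of the readyq
    readyq << waitq.shift unless waitq.empty?
  end

  { acquired: acquired, log: log }
end

result = simulate(iterations: 4)
p result[:acquired]
result[:log].each { |line| puts line }
```

In this model only `T` ever appears in the `acquired` tally, while each waiter in turn is woken, fails to lock `m`, and rejoins the waitq, which is the cycle the report describes.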
Issue #21840 has been updated by ivoanjo (Ivo Anjo).

Really cool! I wonder if this will end up solving https://bugs.ruby-lang.org/issues/19717 as well / https://github.com/ruby/ruby/pull/8331 is an old PR for it.

----------------------------------------
Bug #21840: Locking a mutex can lead to starvation
https://bugs.ruby-lang.org/issues/21840#change-116228

* Author: luke-gru (Luke Gruber)
* Status: Closed
* Assignee: luke-gru (Luke Gruber)
* Target version: 4.1
* ruby -v: 4.1.0dev
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: REQUIRED, 4.0: REQUIRED
Issue #21840 has been updated by luke-gru (Luke Gruber).

Status changed from Closed to Open

I wasn't aware of those old issues. I'll take a look, thanks!

I reverted the commit because of issues with `Monitor` tests in CI. I thought they might be related to this change, but it kept failing even after the revert. It turns out it was a badly designed test and I've fixed it. I plan on recommitting, but I may change the implementation.

----------------------------------------
Bug #21840: Locking a mutex can lead to starvation
https://bugs.ruby-lang.org/issues/21840#change-116229

* Author: luke-gru (Luke Gruber)
* Status: Open
* Assignee: luke-gru (Luke Gruber)
* Target version: 4.1
* ruby -v: 4.1.0dev
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: REQUIRED, 4.0: REQUIRED
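When trying a candidate fix, it may help to bound the example script from the report so that it terminates instead of hanging when the waiters starve. The following is a minimal sketch under that assumption; the helper `count_waiters_served` and its parameters are hypothetical and not part of the report or of any patch, and the number it returns depends on the Ruby build and on timing.

```ruby
def fib(n)
  return n if n <= 1
  fib(n - 1) + fib(n - 2)
end

# Hypothetical helper: run the reproduction for `deadline` seconds and
# return how many waiter threads managed to lock the mutex in that time.
def count_waiters_served(waiters: 5, deadline: 1.0, out: $stderr)
  m = Mutex.new
  started = Queue.new

  # The thread that keeps re-acquiring m, as t1 does in the report.
  hog = Thread.new do
    started << true
    loop do
      fib(20)
      m.synchronize { out.puts "t1 iter" } # write while holding m
    end
  end
  started.pop # wait until the hog thread is running

  threads = Array.new(waiters) do
    Thread.new do
      m.synchronize { Thread.current[:served] = true }
    end
  end

  sleep deadline
  served = threads.count { |t| t[:served] } # snapshot before tearing down
  ([hog] + threads).each(&:kill).each(&:join)
  served
end
```

Calling `count_waiters_served(waiters: 5, deadline: 1.0)` returns a count between 0 and 5; a count that stays well below the number of waiters would be consistent with the starvation described in the report, though a short deadline alone can also produce it.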
participants (2)
- ivoanjo (Ivo Anjo)
- luke-gru (Luke Gruber)