[ruby-core:113788] [Ruby master Bug#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

List overview All Threads
Download

newer

older

[ruby-core:116140] [Ruby master...

[ruby-core:115684] [Ruby master...

ioquatix (Samuel Williams)

7 Jun 2023 7 Jun '23

11:04 a.m.

Issue #19717 has been reported by ioquatix (Samuel Williams). ---------------------------------------- Bug #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

Show replies by date

kjtsanaktsidis (KJ Tsanaktsidis)

11 Jun 11 Jun

1:33 p.m.

New subject: [ruby-core:113867] [Ruby master Bug#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). I spent a bit of time looking into this issue. I think I understand _why_ it's happening now, but I don't quite know what, if anything, we should do about it. Firstly, I fixed up the reproduction issue in a couple of ways - after these changes, both Ruby implementations, as well as the pure-C++ reproduction, show the same issue. Fixes: https://github.com/KJTsanaktsidis/ruby-condition-variable-timeout Then, I also patched the Ruby interpreter to add a few more USDT tracepoints to diagnose the issue: https://github.com/ruby/ruby/compare/master...KJTsanaktsidis:ruby:ktsanakts… ---- So, what's happening, in short, is a combination of a few things: 1. If a single thread has a mutex, signals a condition variable, unlocks the mutex, and then locks it again, it's very likely that the thread will be successful in acquiring the mutex again the second time immediately. This is because whilst the condvar signal makes another thread _eligible_ to run, the original thread still holds the GVL and will do so until it sleeps or does blocking I/O of some kind (or finishes its timeslice, but that's not relevant here). 2. Ruby condition variables are fair (as a byproduct of Ruby mutexes being fair). We maintain a list of threads that are waiting to be signaled, and wake them up from oldest to newest. 3. Ruby of course doesn't understand whether or not a "resource" is available when using a condition variable. It will wake up a thread when the condvar is signaled, but it's up the user code to decide if it can actually make forward progress at that point (i.e. there are spurious wakeups, from an application perspective, although not spurious wakeups from Ruby's point of view - something really did signal the condvar). 4. Because of the combination of these factors, when you have one thread performing the actions from point 1, it's going to be doing a spurious wakeup on the condvar; whatever thread which wakes up won't be able to make progress because the first thread acquired the resource again. 5. With the right combination of threads, concecutive release-acquire operations on a resource, and predictable delays, this can lead to starvation. In the reproduction example above, you have three threads and two consecutive release-acquire operations. That means that half of the condvar wakeups will be spurious; the wakeups which are immediately followed by the same thread re-acquiring the resource don't help maintain forward progress. However, it _does_ put the spuriously-woken-up thread to the back of the waitq. The resource ends up ping-ponging between two of the threads whilst the third is eternally starved out. 6.I wrote up a detailed log of which thread does what in what order here: https://gist.github.com/KJTsanaktsidis/ddfbccc7b48d90277ec81ebf16591cb8 --- This is not a Ruby problem per-se; the C++ example in @ioquatix's repo exhibits the same issue (after I fixed it). I believe glibc's pthread condition variables have fair queues like in Ruby (although notably glibc's _mutexes_ are not fair). Ruby perhaps makes the issue _more likely_ because the GVL adds more determinsim though. The best idea I have right now to "fix" this is to have `ConditionVariable#wait` take an optional kwarg that lets the caller report whether their last wakeup was spurious or not; if set, we could prepend the caller to the waitq rather than appending them. It would be much better if we could solve this problem without adding any API surface though IMO. I feel like this must be a problem with a lot of literature around it but perhaps I don't know the right words to look for. ---------------------------------------- Bug #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103515 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

kjtsanaktsidis (KJ Tsanaktsidis)

2:29 p.m.

New subject: [ruby-core:113868] [Ruby master Bug#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). This is mostly just semantics, but I read some sources tonight - glibc doesn't actually do anything to implement "fair" condition variables. But, in the Linux kernel, the set of waiters waiting on a particular futex address is stored in a linked list. So a thread might be able to come along and "steal" a mutex (and hence the resource) if it's unlocked ahead of another thread that's been patiently waiting; this is the sense in which it's "unfair". But, if everybody's blocked waiting (i.e. doing `FUTEX_WAIT`), and then the condvar is signaled (using `FUTEX_WAKE`), which is almost certainly the case in all of these tests, then the thread which was waiting the longest will actually be the one woken, I think, just because of the linked list being traversed in order to do the wakeups. ---------------------------------------- Bug #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103516 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

kjtsanaktsidis (KJ Tsanaktsidis)

2:32 p.m.

New subject: [ruby-core:113869] [Ruby master Bug#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).

...

have ConditionVariable#wait take an optional kwarg that lets the caller report whether their last wakeup was spurious or no

A better API might be to have the `ConditionVariable#wait` accept a block, and have Ruby do the looping & checking of the condition. Something like ``` @condvar.wait(mutex) do @resource.available? end ``` That's probably a nicer Ruby API anyway, and means it could transparently implement something like "append to the waitq on the first wait, and prepend thereafter". ---------------------------------------- Bug #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103517 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

jeremyevans0 (Jeremy Evans)

12 Jun 12 Jun

3:15 a.m.

New subject: [ruby-core:113876] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by jeremyevans0 (Jeremy Evans). Tracker changed from Bug to Feature Backport deleted (3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN) Based on the thorough analysis by @kjtsanaktsidis, this doesn't appear to be a bug in Ruby, but rather common behavior of condition variable across implementations. The idea of `ConditionVariable#wait` taking a block to address the spurious wakeup issue seems interesting to me, but that would be a feature request. ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103525 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

ioquatix (Samuel Williams)

3:18 a.m.

New subject: [ruby-core:113877] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by ioquatix (Samuel Williams).

...

the thread which was waiting the longest will actually be the one woken

To me, this is the correct behaviour and what I try to implement in Async. If there is a spurious wakeup, you don't loose your position in the queue. ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103527 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

kjtsanaktsidis (KJ Tsanaktsidis)

8:42 a.m.

New subject: [ruby-core:113885] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).

...

I like the proposed implementation. Is there any chance it still has the similar problems? If we always reque at the front, could multiple waiters reque in front of each other? i.e. it still depends on the order.

I think it can't have similar problems as long as it's woken up by `#signal` - but if it's woken up by `#broadcast` you probably don't want this prepending behaviour. I think it's possible, with a bit of shuffling around in `thread_sync.c`, for the implementation to not actually remove the thread off the ConditionVariable's waitq until it's actually successfully acquired the resource; that way, if e.g. `#signal` is called twice, the two threads that are woken up will maintain their relative order in the waitq. I also noticed last night that `MonitorMixin::ConditionVariable` already has `#wait_until` and `#wait_for` methods that take a block. We could adjust the behaviour of these without any new API at all, which might be preferable? They're implemented by calling into `::ConditionVariable#wait` via `rb_funcall`, but we could hack it to call some internal-only APIs to achieve different waitq behaviour if we wanted. (Sidebar - when does one use `Monitor` in ruby instead of `Mutex`? Is it just that `Monitor` is re-entrant and `Mutex` is not?) Finally - I did some reading, and another magic phrase I found about this problem is "handoff semantics". We could make it so that when calling `ConditionVariable#signal` whilst holding the associated mutex, the signaled thread is guaranteed to acquire the mutex next after the current thread releases it; essentially, the mutex is "handed off". Because of the GVL, we would also need to "hand off" that off too and give it to the signaled thread. Essentially, when the signalling thread unlocks the mutex, it would need to stop and let the signaled thread run instead. The textbook downside of this approach is that it leads to increased context switching; in a real multi-CPU language, there might be another thread already running which is about to try and get the mutex, and it's best for _overall_ throughput if it's allowed to do so rather than stopping it and switching threads. In Ruby that applies too (if the signalling thread releases & immediately re-acquires the mutex, we can avoid a context switch). I also think there is a _worse_ problem with this approach in Ruby - the thread that did the signaling has to sleep and do _nothing_ after unlocking the mutex, whereas had it been allowed to run for a while, it might have started a blocking IO operation or called into a C extension or some such and yielded the GVL _whilst also doing something productive_. So, tl;dr, I don't think we should do this "handoff" approach. ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103535 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

ioquatix (Samuel Williams)

8:45 a.m.

New subject: [ruby-core:113886] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by ioquatix (Samuel Williams).

...

I think it's possible, with a bit of shuffling around in thread_sync.c, for the implementation to not actually remove the thread off the ConditionVariable's waitq until it's actually successfully acquired the resource; that way, if e.g. #signal is called twice, the two threads that are woken up will maintain their relative order in the waitq.

Yes, I think that would be a good idea if we can do it. ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103536 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

kjtsanaktsidis (KJ Tsanaktsidis)

8:55 a.m.

New subject: [ruby-core:113887] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).

...

So, tl;dr, I don't think we should do this "handoff" approach.

After reading this I realised we don't actually have to hand off the GVL when the mutex is handed off. The thread that signalled the condition variable and unlocked the mutex could keep running, and the mutex could then be "reserved" for the thread that was woken up. If the unlocking thread then attempted to lock the mutex again, _only then_ would it be put to sleep and a switch to the thread holding the "reservation" would happen. Yes, it's more context switches (only if the unlocking thread tries to lock again), but that probably doesn't matter so much on the scale of Ruby performance. The unlocking thread can continue to run after unlocking, though. (Sorry for my real-time musing on the bugtracker) ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103537 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

ioquatix (Samuel Williams)

21 Jun 21 Jun

8:53 a.m.

New subject: [ruby-core:113960] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by ioquatix (Samuel Williams). I like this idea. ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103614 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

kjtsanaktsidis (KJ Tsanaktsidis)

24 Jun 24 Jun

8:46 a.m.

New subject: [ruby-core:114016] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). I finished spiking out a bit of a POC of how this kind of handoff could work - https://github.com/ruby/ruby/pull/7977 I'm sure it is chock-a-block full of other fun bugs, but it seems to make your reproduction script work; it runs to completion in 10 seconds now instead of crashing after ~500ms. WDYT? Is this an approach that's worth pursuing? ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103678 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

kjtsanaktsidis (KJ Tsanaktsidis)

7 Jul 7 Jul

12:04 a.m.

New subject: [ruby-core:114102] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). @ioquatix any thboughts? :) ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-103778 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

kjtsanaktsidis (KJ Tsanaktsidis)

10 Jan 10 Jan

10:32 a.m.

New subject: [ruby-core:116146] [Ruby master Feature#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious.

Issue #19717 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). @ioquatix do we want to do anything about this? This really did stem from a real bug in somebody's code it seems (https://github.com/socketry/async/issues/99) so I guess it would be good to do something. The "something"s we cooked up so far seem to be... * Add a ConditionVariable#wait_for method which takes a block and returns whether or not the wakeup was actually not spurious (from the application's perspective - i.e. can the thread make forward progress now, or does it need to wait on the condvar again?). Pros: admits a fairly simple implementation, and the API kind of feels more ruby-like anyway; Cons: requires code to actually use the new API * Make Mutex#unlock cause the calling thread to give up the GVL and schedule the thread waiting for the mutex instead. Pros: sort of simple, solves the problem for users of the current API Cons: probably not great for performance (the calling thread might have been about to do some IO anyway, so if we'd just given it a bit more time it would be sleeping doing something productive (waiting for IO) instead of just being blocked on the GVL). * Make Mutex be "reserved" for a thread if it's blocked on Mutex#lock, so that if the thread which currently has the mutex calls #unlock and then #lock again, it will find the mutex is reserved for someone else and transfer control. Pros: solves the problem for users of the current API; Cons: Pretty complex (https://github.com/ruby/ruby/pull/8331) What is the next step for this? Should I put this on the agenda for a dev meeting to discuss? Something else? ---------------------------------------- Feature #19717: `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. https://bugs.ruby-lang.org/issues/19717#change-106151 * Author: ioquatix (Samuel Williams) * Status: Open * Priority: Normal ---------------------------------------- For background, see this issue <https://github.com/socketry/async/issues/99>. It looks like `ConditionVariable#signal` is not fair, if the calling thread immediately reacquires the resource. I've given a detailed reproduction here as it's non-trivial: <https://github.com/ioquatix/ruby-condition-variable-timeout>. Because the spurious wakeup occurs, the thread is pushed to the back of the waitq, which means any other waiting thread will acquire the resource, and that thread will perpetually be at the back of the queue. I believe the solution is to change `ConditionVarialbe#signal` should only remove the thread from the waitq if it's possible to acquire the lock. Otherwise it should be left in place, so that the order is retained, this should result in fair scheduling. -- https://bugs.ruby-lang.org/

130

days inactive

347

days old

ruby-core@ml.ruby-lang.org

Manage subscription

12 comments

3 participants

tags (0)

participants (3)

ioquatix (Samuel Williams)
jeremyevans0 (Jeremy Evans)
kjtsanaktsidis (KJ Tsanaktsidis)