
Issue #21633 has been reported by ioquatix (Samuel Williams).

----------------------------------------
Bug #21633: A `rb_thread_call_without_gvl` loop can cause the fiber scheduler to ignore signals.
https://bugs.ruby-lang.org/issues/21633

* Author: ioquatix (Samuel Williams)
* Status: Open
* Assignee: ioquatix (Samuel Williams)
* Backport: 3.3: REQUIRED, 3.4: REQUIRED
----------------------------------------
The gRPC gem calls `rb_thread_call_without_gvl` in a loop, and doesn't exit when interrupts are delivered if `Thread.handle_interrupt(::SignalException => :never)` is used by the scheduler to create a safe point for asynchronous signal handling.

While this may not be considered a bug in any particular part of the system, the combination of these behaviours creates a situation where gRPC can hang for a long time and ignore SIGINT / SIGTERM.

## gRPC Failure Analysis

From [`src/ruby/ext/grpc/rb_completion_queue.c`](https://github.com/samuel-williams-shopify/grpc/blob/debug/src/ruby/ext/grpc...):

```c
static void unblock_func(void* param) {
  next_call_stack* const next_call = (next_call_stack*)param;
  next_call->interrupted = 1;  // ← SIGINT causes this flag to be set
}

grpc_event rb_completion_queue_pluck(grpc_completion_queue* queue, void* tag,
                                     gpr_timespec deadline,
                                     const char* reason) {
  // ...
  do {
    next_call.interrupted = 0;  // ← Reset flag
    rb_thread_call_without_gvl(grpc_rb_completion_queue_pluck_no_gil,
                               (void*)&next_call, unblock_func,
                               (void*)&next_call);
    if (next_call.event.type != GRPC_QUEUE_TIMEOUT) break;
  } while (next_call.interrupted);  // ← The problem! If interrupted, LOOP AGAIN!
  return next_call.event;
}
```

The loop explicitly retries after interruption, making SIGINT/SIGTERM ineffective. This might be considered the expected behaviour if `Thread.handle_interrupt` is used. However, the goal of `Thread.handle_interrupt` in the fiber scheduler is to create a safe point for signal handling, not to prevent signals from being handled at all.
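
The safe-point idea described above can be illustrated in plain Ruby. The following is a minimal sketch, not the fiber scheduler's actual code: `run_with_safe_point` is a hypothetical helper, a worker thread stands in for the scheduler, and `Thread#raise` with `Interrupt` stands in for a real SIGINT so that the outcome is deterministic.

```ruby
# Sketch only: defer an interrupt with :never, then deliver it at an
# explicit safe point with a nested :immediate block.
# `run_with_safe_point` is a hypothetical name used for illustration.
def run_with_safe_point
  log = []
  ready = Queue.new
  resume = Queue.new

  worker = Thread.new do
    # Interrupts that arrive inside this block are queued, not raised.
    Thread.handle_interrupt(SignalException => :never) do
      begin
        ready << true
        resume.pop # The simulated SIGINT arrives here but stays pending.
        log << :critical_section_finished

        # Safe point: pending interrupts may be raised inside this block.
        Thread.handle_interrupt(SignalException => :immediate) do
          Thread.pass
        end

        log << :not_reached
      rescue Interrupt
        log << :interrupted_at_safe_point
      end
    end
  end

  ready.pop                                   # Worker is inside the :never block.
  worker.raise(Interrupt, "simulated SIGINT") # Stand-in for a real signal.
  resume << true                              # Let the worker finish its critical section.
  worker.join
  log
end

p run_with_safe_point
```

In this sketch the interrupt arrives while the worker is blocked, is held pending by the `:never` mask, and is raised only once the nested `:immediate` block is entered, so the log should record the critical section finishing before the interrupt is observed. A native loop that keeps re-entering `rb_thread_call_without_gvl` never returns to Ruby code where such a block could run.
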
Since this loop never yields back to the scheduler, the scheduler never reaches such a safe point, and the loop continues indefinitely.

As `rb_thread_call_without_gvl` invokes `vm_check_ints_blocking`, one solution is to yield to the scheduler when there are pending interrupts. This gives the scheduler a chance to handle the incoming SIGINT / SIGTERM signals at the safe point.

For a full reproduction of the issue using gRPC: https://github.com/samuel-williams-shopify/grpc-interrupt

For the proposed fix: https://github.com/ruby/ruby/pull/14700

-- 
https://bugs.ruby-lang.org/