[ruby-core:124673] [Ruby Bug#21860] Process.fork: the child may deadlock on `th->interrupt_lock` in `threadptr_interrupt_exec_cleanup`
Issue #21860 has been reported by byroot (Jean Boussier).

----------------------------------------
Bug #21860: Process.fork: the child may deadlock on `th->interrupt_lock` in `threadptr_interrupt_exec_cleanup`
https://bugs.ruby-lang.org/issues/21860

* Author: byroot (Jean Boussier)
* Status: Open
* ruby -v: ruby 3.4.4 (2025-05-14 revision a38531fd3f) +PRISM [aarch64-linux]
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN
----------------------------------------
We recently observed some deadlocked processes. They deadlocked during child initialization, right after `Process.fork`.

Based on the `gdb` session, the child is deadlocked in `threadptr_interrupt_exec_cleanup`, which suggests that `th->interrupt_lock` was held by another thread in the parent at the time of the fork, and wasn't reinitialized in the child before `threadptr_interrupt_exec_cleanup` was called.

```c
(gdb) bt 20
#0  0x0000ffff855a09c0 in __lll_lock_wait () from /lib64/libc.so.6
#1  0x0000ffff855a6d60 in pthread_mutex_lock@@GLIBC_2.17 () from /lib64/libc.so.6
#2  0x0000aaaaaeec04c8 [PAC] in rb_native_mutex_lock (lock=<optimized out>) at /ruby-3.4.4/thread_pthread.c:116
#3  threadptr_interrupt_exec_cleanup (th=<optimized out>) at thread.c:6052
#4  thread_cleanup_func_before_exec (th_ptr=0xffff35803000) at thread.c:514
#5  thread_cleanup_func (atfork=1, th_ptr=0xffff35803000) at thread.c:524
#6  terminate_atfork_i (current_th=0xffff84e1e000, th=0xffff35803000) at thread.c:4769
#7  rb_thread_atfork_internal (atfork=<optimized out>, th=0xffff84e1e000) at thread.c:4736
#8  rb_thread_atfork () at thread.c:4779
#9  0x0000aaaaaee110fc [PAC] in after_fork_ruby (pid=0) at process.c:1693
#10 rb_fork_ruby (status=status@entry=0x0) at process.c:4253
#11 0x0000aaaaaee11154 [PAC] in proc_fork_pid () at process.c:4266
#12 rb_proc__fork (_obj=<optimized out>) at process.c:4313
#13 0x0000aaaaaeefeb7c [PAC] in vm_call_cfunc_with_frame_ (stack_bottom=0xffff84e7a1c8, argv=0xffff84e7a1d0, argc=0, calling=<optimized out>, reg_cfp=0xffff84f79150, ec=0xffff84e31050) at /ruby-3.4.4/vm_insnhelper.c:3794
#14 vm_call_cfunc_with_frame (ec=0xffff84e31050, reg_cfp=0xffff84f79150, calling=<optimized out>) at /ruby-3.4.4/vm_insnhelper.c:3840
#15 0x0000aaaaaef1aa44 [PAC] in vm_sendish (method_explorer=<optimized out>, block_handler=<optimized out>, cd=<optimized out>, reg_cfp=<optimized out>, ec=<optimized out>) at /ruby-3.4.4/vm_callinfo.h:415
#16 vm_exec_core (ec=0xffff84e31050) at /ruby-3.4.4/insns.def:1063
#17 0x0000aaaaaef0aa58 [PAC] in rb_vm_exec (ec=ec@entry=0xffff84e31050) at vm.c:2595
#18 0x0000aaaaaef0fe48 [PAC] in vm_call0_body (ec=ec@entry=0xffff84e31050, calling=calling@entry=0xffffea5a1e78, argv=argv@entry=0x0) at /ruby-3.4.4/vm_eval.c:225
#19 0x0000aaaaaef136e8 in vm_call0_cc (kw_splat=0, cc=0xfffeee654508, argv=<optimized out>, argc=0, id=27393, recv=281472906794680, ec=0xffff84e31050) at /ruby-3.4.4/vm_eval.c:101
```

--
https://bugs.ruby-lang.org/
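The failure mode described in the report can be reproduced outside Ruby with plain pthreads. The following is only a minimal sketch, not Ruby's code and not necessarily what the eventual fix does: `demo_lock`, `holder_main`, and `fork_with_held_lock` are hypothetical names standing in for `th->interrupt_lock` and the fork path. A helper thread holds the mutex while the main thread forks; the child inherits the mutex in its locked state but not the thread that would unlock it. The child uses `pthread_mutex_trylock` rather than `pthread_mutex_lock` so the sketch reports the problem instead of hanging. Re-initializing a mutex in the child is a common mitigation, although POSIX does not strictly guarantee the behavior of pthread calls in the child of a multithreaded process.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a per-thread lock such as th->interrupt_lock. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int ready_pipe[2];   /* holder -> main: "the lock is now held" */
static int release_pipe[2]; /* main -> holder: "you may unlock now"   */

/* Helper thread: takes the lock and keeps it until told to release it. */
static void *holder_main(void *arg)
{
    char byte = 'x';
    int rc = pthread_mutex_lock(&demo_lock);
    (void)arg;
    assert(rc == 0);
    if (write(ready_pipe[1], &byte, 1) != 1) return NULL;
    if (read(release_pipe[0], &byte, 1) != 1) return NULL;
    pthread_mutex_unlock(&demo_lock);
    return NULL;
}

/*
 * Forks while another thread holds demo_lock.
 * Returns 0 if the child could take the lock, 1 if the lock was still
 * busy in the child, and -1 if the setup itself failed.
 */
int fork_with_held_lock(int reinit_in_child)
{
    pthread_t holder;
    char byte = 'x';
    int status = 0;
    ssize_t wrote;
    pid_t pid;

    if (pipe(ready_pipe) != 0 || pipe(release_pipe) != 0) return -1;
    if (pthread_create(&holder, NULL, holder_main, NULL) != 0) return -1;
    /* Wait until the helper thread really owns the lock. */
    if (read(ready_pipe[0], &byte, 1) != 1) return -1;

    pid = fork();
    if (pid == 0) {
        /* Child: only the forking thread exists here, so nobody will
         * ever unlock the inherited copy of demo_lock. */
        if (reinit_in_child) pthread_mutex_init(&demo_lock, NULL);
        _exit(pthread_mutex_trylock(&demo_lock) == 0 ? 0 : 1);
    }

    /* Parent: let the helper thread unlock and finish, then clean up. */
    wrote = write(release_pipe[1], &byte, 1);
    pthread_join(holder, NULL);
    close(ready_pipe[0]);
    close(ready_pipe[1]);
    close(release_pipe[0]);
    close(release_pipe[1]);

    if (pid < 0 || wrote != 1) return -1;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Calling `fork_with_held_lock(0)` models the reported situation: the child finds the lock busy, and a blocking `pthread_mutex_lock` at that point would wait forever, as in frames #0 to #3 of the backtrace. Calling `fork_with_held_lock(1)` models a child that resets the lock before touching it.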
Issue #21860 has been updated by byroot (Jean Boussier).

Oh, it has been fixed by https://github.com/ruby/ruby/pull/12981

----------------------------------------
Bug #21860: Process.fork: the child may deadlock on `th->interrupt_lock` in `threadptr_interrupt_exec_cleanup`
https://bugs.ruby-lang.org/issues/21860#change-116270

* Author: byroot (Jean Boussier)
* Status: Open
* ruby -v: ruby 3.4.4 (2025-05-14 revision a38531fd3f) +PRISM [aarch64-linux]
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN
----------------------------------------

--
https://bugs.ruby-lang.org/
Issue #21860 has been updated by byroot (Jean Boussier).

Status changed from Open to Closed

3.4 backport PR: https://github.com/ruby/ruby/pull/16060
3.3 backport PR: https://github.com/ruby/ruby/pull/16061

----------------------------------------
Bug #21860: Process.fork: the child may deadlock on `th->interrupt_lock` in `threadptr_interrupt_exec_cleanup`
https://bugs.ruby-lang.org/issues/21860#change-116272

* Author: byroot (Jean Boussier)
* Status: Closed
* ruby -v: ruby 3.4.4 (2025-05-14 revision a38531fd3f) +PRISM [aarch64-linux]
* Backport: 3.2: WONTFIX, 3.3: REQUIRED, 3.4: REQUIRED, 4.0: DONTNEED
----------------------------------------

--
https://bugs.ruby-lang.org/
Issue #21860 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 3.2: WONTFIX, 3.3: DONE, 3.4: REQUIRED, 4.0: DONTNEED to 3.2: WONTFIX, 3.3: DONE, 3.4: DONE, 4.0: DONTNEED

ruby_3_4: merged at commit:43771bb0efcd139acd9112a770e8b8d719118dce.

----------------------------------------
Bug #21860: Process.fork: the child may deadlock on `th->interrupt_lock` in `threadptr_interrupt_exec_cleanup`
https://bugs.ruby-lang.org/issues/21860#change-116629

* Author: byroot (Jean Boussier)
* Status: Closed
* ruby -v: ruby 3.4.4 (2025-05-14 revision a38531fd3f) +PRISM [aarch64-linux]
* Backport: 3.2: WONTFIX, 3.3: DONE, 3.4: DONE, 4.0: DONTNEED
----------------------------------------

--
https://bugs.ruby-lang.org/
participants (2): byroot (Jean Boussier), nagachika (Tomoyuki Chikanaga)