[ruby-core:114380] [Ruby master Bug#19837] Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2

Issue #19837 has been reported by kjtsanaktsidis (KJ Tsanaktsidis). ---------------------------------------- Bug #19837: Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2 https://bugs.ruby-lang.org/issues/19837 * Author: kjtsanaktsidis (KJ Tsanaktsidis) * Status: Open * Priority: Normal * ruby -v: ruby 3.1.4p236 (2023-07-26 revision a8670865c0) [arm64-darwin22] * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- On Ruby 3.1 & 3.2, if you have one thread blocked into a directed call to `Process.waitpid2` with a pid specified, a concurrent call to `Process.waitpid2 -1` will not be able to find & reap any other terminated child process, even one with a different pid that is not individually being waited on. I've attached a Ruby program which should terminate but doesn't as a result of this bug, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) do behave how you would expect. My reproduction creates two processes; a long-running process that does not exit, and a short one which does. There is a background thread calling `Process.waitpid2` on the long process. Then, a concurrent call to `Process.waitpid2 -1` does not notice that the short-running process has exited. My `wait_bug.rb` program _does_ work properly and terminate on the current master branch of Ruby; I assume this is because all of the MJIT-related process management stuff with the waiting_pids & stuff has been cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the `ruby_3_2` and `ruby_3_1` branch? Thanks! ---Files-------------------------------- wait_bug.c (1.55 KB) wait_bug.rb (735 Bytes) -- https://bugs.ruby-lang.org/

Issue #19837 has been updated by jeremyevans0 (Jeremy Evans). @kjtsanaktsidis Yes, please prepare backport pull requests for ruby_3_2 and ruby_3_1, then we can update the backport flags in this issue to let the branch maintainer know it is ready for backporting. ---------------------------------------- Bug #19837: Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2 https://bugs.ruby-lang.org/issues/19837#change-104149 * Author: kjtsanaktsidis (KJ Tsanaktsidis) * Status: Open * Priority: Normal * ruby -v: ruby 3.1.4p236 (2023-07-26 revision a8670865c0) [arm64-darwin22] * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- On Ruby 3.1 & 3.2, if you have one thread blocked into a directed call to `Process.waitpid2` with a pid specified, a concurrent call to `Process.waitpid2 -1` will not be able to find & reap any other terminated child process, even one with a different pid that is not individually being waited on. I've attached a Ruby program which should terminate but doesn't as a result of this bug, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) do behave how you would expect. My reproduction creates two processes; a long-running process that does not exit, and a short one which does. There is a background thread calling `Process.waitpid2` on the long process. Then, a concurrent call to `Process.waitpid2 -1` does not notice that the short-running process has exited. My `wait_bug.rb` program _does_ work properly and terminate on the current master branch of Ruby; I assume this is because all of the MJIT-related process management stuff with the waiting_pids & stuff has been cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the `ruby_3_2` and `ruby_3_1` branch? Thanks! ---Files-------------------------------- wait_bug.c (1.55 KB) wait_bug.rb (735 Bytes) -- https://bugs.ruby-lang.org/

Issue #19837 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). OK - I hope I've done this right, let me know otherwise :) * Ruby 3.0: https://github.com/ruby/ruby/pull/8248 * Ruby 3.1: https://github.com/ruby/ruby/pull/8246 * Ruby 3.2: https://github.com/ruby/ruby/pull/8247 Also, I opened a PR on the main branch which adds _just_ the test I wrote for this issue - the existing implementation passes it though so there are no other changes needed: https://github.com/ruby/ruby/pull/8245 ---------------------------------------- Bug #19837: Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2 https://bugs.ruby-lang.org/issues/19837#change-104155 * Author: kjtsanaktsidis (KJ Tsanaktsidis) * Status: Open * Priority: Normal * ruby -v: ruby 3.1.4p236 (2023-07-26 revision a8670865c0) [arm64-darwin22] * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- On Ruby 3.1 & 3.2, if you have one thread blocked into a directed call to `Process.waitpid2` with a pid specified, a concurrent call to `Process.waitpid2 -1` will not be able to find & reap any other terminated child process, even one with a different pid that is not individually being waited on. I've attached a Ruby program which should terminate but doesn't as a result of this bug, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) do behave how you would expect. My reproduction creates two processes; a long-running process that does not exit, and a short one which does. There is a background thread calling `Process.waitpid2` on the long process. Then, a concurrent call to `Process.waitpid2 -1` does not notice that the short-running process has exited. My `wait_bug.rb` program _does_ work properly and terminate on the current master branch of Ruby; I assume this is because all of the MJIT-related process management stuff with the waiting_pids & stuff has been cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the `ruby_3_2` and `ruby_3_1` branch? Thanks! ---Files-------------------------------- wait_bug.c (1.55 KB) wait_bug.rb (735 Bytes) -- https://bugs.ruby-lang.org/

Issue #19837 has been updated by jeremyevans0 (Jeremy Evans). Status changed from Open to Closed Backport changed from 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN to 3.0: UNKNOWN, 3.1: REQUIRED, 3.2: REQUIRED This updates the backport flags. Note that Ruby 3.0 is in security maintenance, and this doesn't appear to be security related, so it is unlikely to be backported to Ruby 3.0. ---------------------------------------- Bug #19837: Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2 https://bugs.ruby-lang.org/issues/19837#change-104156 * Author: kjtsanaktsidis (KJ Tsanaktsidis) * Status: Closed * Priority: Normal * ruby -v: ruby 3.1.4p236 (2023-07-26 revision a8670865c0) [arm64-darwin22] * Backport: 3.0: UNKNOWN, 3.1: REQUIRED, 3.2: REQUIRED ---------------------------------------- On Ruby 3.1 & 3.2, if you have one thread blocked into a directed call to `Process.waitpid2` with a pid specified, a concurrent call to `Process.waitpid2 -1` will not be able to find & reap any other terminated child process, even one with a different pid that is not individually being waited on. I've attached a Ruby program which should terminate but doesn't as a result of this bug, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) do behave how you would expect. My reproduction creates two processes; a long-running process that does not exit, and a short one which does. There is a background thread calling `Process.waitpid2` on the long process. Then, a concurrent call to `Process.waitpid2 -1` does not notice that the short-running process has exited. My `wait_bug.rb` program _does_ work properly and terminate on the current master branch of Ruby; I assume this is because all of the MJIT-related process management stuff with the waiting_pids & stuff has been cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the `ruby_3_2` and `ruby_3_1` branch? Thanks! ---Files-------------------------------- wait_bug.c (1.55 KB) wait_bug.rb (735 Bytes) -- https://bugs.ruby-lang.org/

Issue #19837 has been updated by kjtsanaktsidis (KJ Tsanaktsidis). Thanks - yeah it’s no problem if it doesn’t get back ported to 3.0. ---------------------------------------- Bug #19837: Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2 https://bugs.ruby-lang.org/issues/19837#change-104157 * Author: kjtsanaktsidis (KJ Tsanaktsidis) * Status: Closed * Priority: Normal * ruby -v: ruby 3.1.4p236 (2023-07-26 revision a8670865c0) [arm64-darwin22] * Backport: 3.0: UNKNOWN, 3.1: REQUIRED, 3.2: REQUIRED ---------------------------------------- On Ruby 3.1 & 3.2, if you have one thread blocked into a directed call to `Process.waitpid2` with a pid specified, a concurrent call to `Process.waitpid2 -1` will not be able to find & reap any other terminated child process, even one with a different pid that is not individually being waited on. I've attached a Ruby program which should terminate but doesn't as a result of this bug, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) do behave how you would expect. My reproduction creates two processes; a long-running process that does not exit, and a short one which does. There is a background thread calling `Process.waitpid2` on the long process. Then, a concurrent call to `Process.waitpid2 -1` does not notice that the short-running process has exited. My `wait_bug.rb` program _does_ work properly and terminate on the current master branch of Ruby; I assume this is because all of the MJIT-related process management stuff with the waiting_pids & stuff has been cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the `ruby_3_2` and `ruby_3_1` branch? Thanks! ---Files-------------------------------- wait_bug.c (1.55 KB) wait_bug.rb (735 Bytes) -- https://bugs.ruby-lang.org/

Issue #19837 has been updated by usa (Usaku NAKAMURA). Backport changed from 3.0: UNKNOWN, 3.1: REQUIRED, 3.2: REQUIRED to 3.0: UNKNOWN, 3.1: DONE, 3.2: REQUIRED merged PR#8246 into ruby_3_1 thx! ---------------------------------------- Bug #19837: Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2 https://bugs.ruby-lang.org/issues/19837#change-104459 * Author: kjtsanaktsidis (KJ Tsanaktsidis) * Status: Closed * Priority: Normal * ruby -v: ruby 3.1.4p236 (2023-07-26 revision a8670865c0) [arm64-darwin22] * Backport: 3.0: UNKNOWN, 3.1: DONE, 3.2: REQUIRED ---------------------------------------- On Ruby 3.1 & 3.2, if you have one thread blocked into a directed call to `Process.waitpid2` with a pid specified, a concurrent call to `Process.waitpid2 -1` will not be able to find & reap any other terminated child process, even one with a different pid that is not individually being waited on. I've attached a Ruby program which should terminate but doesn't as a result of this bug, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) do behave how you would expect. My reproduction creates two processes; a long-running process that does not exit, and a short one which does. There is a background thread calling `Process.waitpid2` on the long process. Then, a concurrent call to `Process.waitpid2 -1` does not notice that the short-running process has exited. My `wait_bug.rb` program _does_ work properly and terminate on the current master branch of Ruby; I assume this is because all of the MJIT-related process management stuff with the waiting_pids & stuff has been cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the `ruby_3_2` and `ruby_3_1` branch? Thanks! ---Files-------------------------------- wait_bug.c (1.55 KB) wait_bug.rb (735 Bytes) -- https://bugs.ruby-lang.org/

Issue #19837 has been updated by nagachika (Tomoyuki Chikanaga). Backport changed from 3.0: UNKNOWN, 3.1: DONE, 3.2: REQUIRED to 3.0: UNKNOWN, 3.1: DONE, 3.2: DONE Merged into ruby_3_2 branch at 0b7a4fbaa9c56d2c67d00d86c69f9e5c71803267. Thank you! ---------------------------------------- Bug #19837: Concurrent calls to Process.waitpid2 misbehave on Ruby 3.1 & 3.2 https://bugs.ruby-lang.org/issues/19837#change-104502 * Author: kjtsanaktsidis (KJ Tsanaktsidis) * Status: Closed * Priority: Normal * ruby -v: ruby 3.1.4p236 (2023-07-26 revision a8670865c0) [arm64-darwin22] * Backport: 3.0: UNKNOWN, 3.1: DONE, 3.2: DONE ---------------------------------------- On Ruby 3.1 & 3.2, if you have one thread blocked into a directed call to `Process.waitpid2` with a pid specified, a concurrent call to `Process.waitpid2 -1` will not be able to find & reap any other terminated child process, even one with a different pid that is not individually being waited on. I've attached a Ruby program which should terminate but doesn't as a result of this bug, as well as a C program which demonstrates that the underlying syscalls (at least on Linux) do behave how you would expect. My reproduction creates two processes; a long-running process that does not exit, and a short one which does. There is a background thread calling `Process.waitpid2` on the long process. Then, a concurrent call to `Process.waitpid2 -1` does not notice that the short-running process has exited. My `wait_bug.rb` program _does_ work properly and terminate on the current master branch of Ruby; I assume this is because all of the MJIT-related process management stuff with the waiting_pids & stuff has been cleaned up as part of the MJIT -> RJIT refactoring. Because of this, I'm not sure exactly how to make a patch; should I open a pair of PRs targeting the `ruby_3_2` and `ruby_3_1` branch? Thanks! ---Files-------------------------------- wait_bug.c (1.55 KB) wait_bug.rb (735 Bytes) -- https://bugs.ruby-lang.org/
participants (4)
-
jeremyevans0 (Jeremy Evans)
-
kjtsanaktsidis (KJ Tsanaktsidis)
-
nagachika (Tomoyuki Chikanaga)
-
usa (Usaku NAKAMURA)