[ruby-core:112457] [Ruby master Feature#19443] Cache `Process.pid`

Issue #19443 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
It's not uncommon for database clients and similar network libraries to protect themselves from `Process.fork` by regularly checking `Process.pid`.

Until recently most libc implementations would cache `getpid()`, so this was a cheap check to make. However, as of glibc version 2.25 the PID cache is removed and calls to `getpid()` always invoke the actual system call, which significantly degrades the performance of existing applications.

The reason glibc removed the cache is that some libraries were bypassing `fork(2)` by issuing system calls themselves, causing stale cache issues.

That isn't a concern for Ruby, as bypassing MRI's primitive for forking would render the VM unusable, so we can safely cache the PID.

An example of the issue: https://github.com/rails/rails/issues/47418

--
https://bugs.ruby-lang.org/
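For readers unfamiliar with the pattern the report refers to, here is a minimal sketch of a fork guard. `ForkAwareClient`, `query`, and `reconnects` are hypothetical names made up for illustration and are not taken from any real library; the point is only that `Process.pid` ends up being called on every operation.

```ruby
# Hypothetical client illustrating the fork guard described in the report.
class ForkAwareClient
  attr_reader :reconnects

  def initialize
    @reconnects = 0
    connect
  end

  def query(sql)
    # One Process.pid call per query: cheap when the PID is cached,
    # a getpid(2) system call otherwise.
    reconnect if @owner_pid != Process.pid
    "result of #{sql}"
  end

  private

  def connect
    # A real client would open its socket here.
    @owner_pid = Process.pid
  end

  def reconnect
    @reconnects += 1
    connect
  end
end

client = ForkAwareClient.new
client.query("SELECT 1") # same process: the guard does not fire

reader, writer = IO.pipe
child = fork do
  reader.close
  client.query("SELECT 1") # PID differs from the parent's: guard fires once
  writer.write(client.reconnects.to_s)
  writer.close
  exit!(0)
end
writer.close
child_reconnects = reader.read.to_i
reader.close
Process.wait(child)
```

The guard itself is a single integer comparison, so the cost of the check is dominated by whatever `Process.pid` costs.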

Issue #19443 has been updated by ko1 (Koichi Sasada).

> However as of glibc version 2.25 the PID cache is removed and calls to getpid() always invoke the actual system call which significantly degrades the performance of existing applications.

Could you show some benchmark results with and without your patch? As I understand it, the `getpid()` system call is well tuned, so I am surprised that there is an impact on the app.

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-101904

Patch: https://github.com/ruby/ruby/pull/7326

"ko1 (Koichi Sasada) via ruby-core" <ruby-core@ml.ruby-lang.org> wrote:
> As I understand `getpid()` system call is well tuned so I surprised that there is an impact on the app.

It's not a question of whether a system call is expensive; it's the fact that a system call needs to be made at all. With modern CPU vulnerability mitigations, all system calls got more expensive.

Perhaps the Linux vDSO mechanism can be extended to support getpid, as it does gettimeofday/clock_gettime.

Anyway, caching getpid() is much appreciated.

Issue #19443 has been updated by byroot (Jean Boussier).

```ruby
require 'benchmark/ips'

Benchmark.ips do |x|
  x.report("Process.pid") { Process.pid }
end
```

On macOS where `getpid()` is still cached:

```
ruby 3.2.0 (2022-12-25 revision a528908271) [arm64-darwin22]
Warming up --------------------------------------
         Process.pid     1.879M i/100ms
Calculating -------------------------------------
         Process.pid     18.682M (± 2.1%) i/s -     93.968M in   5.032405s
```

On the same machine, but using the docker `ruby:3.2` image (glibc based):

```
ruby 3.2.0 (2022-12-25 revision a528908271) [aarch64-linux]
Warming up --------------------------------------
         Process.pid   356.920k i/100ms
Calculating -------------------------------------
         Process.pid      3.539M (± 1.3%) i/s -     17.846M in   5.042975s
```

My branch on macOS:

```
ruby 3.3.0dev (2023-02-16T18:42:31Z cache-process-pid 0cd4797132) [arm64-darwin22]
Warming up --------------------------------------
         Process.pid     1.804M i/100ms
Calculating -------------------------------------
         Process.pid     18.465M (± 1.3%) i/s -     93.812M in   5.081288s
```

I'll try to build that branch in a docker container to benchmark it on glibc, but given the implementation I expect the same performance.

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-101908

Issue #19443 has been updated by byroot (Jean Boussier).

Here, from my branch built in an Ubuntu jammy (22.04) based image:

```
ruby 3.3.0dev (2023-02-16T18:42:31Z cache-process-pid 0cd4797132) [aarch64-linux]
Warming up --------------------------------------
         Process.pid     1.848M i/100ms
Calculating -------------------------------------
         Process.pid     18.561M (± 1.5%) i/s -     94.245M in   5.078802s
```

So it's a bit over 5x faster.

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-101909

Issue #19443 has been updated by akr (Akira Tanaka).

I think detecting fork using PID is not a good idea. PID can conflict because PID is recycled.

We can define Process.fork_level as follows.

```
% ruby -e '
class << Process
  attr_accessor :fork_level
end
Process.fork_level = 0
module ForkLevel
  def _fork
    pid = super
    Process.fork_level += 1 if pid == 0
    pid
  end
end
class << Process; self end.prepend ForkLevel
puts "parent_fork_level: #{Process.fork_level}"
Process.wait(fork { puts "child_fork_level: #{Process.fork_level}" })
'
parent_fork_level: 0
child_fork_level: 1
```

fork can be detected by comparing the result of Process.fork_level.

This doesn't use PID (and getpid system call). So, it has no overhead by getpid and no problem with PID recycling.

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-101916
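To make the "comparing the result" step concrete, here is a minimal sketch of how a library object might use such a counter. It assumes Ruby 3.1+ (for `Process._fork`); `ForkGeneration`, `Connection`, and `stale?` are hypothetical names, and the counter lives in its own module here rather than on `Process` only to keep the sketch self-contained.

```ruby
# Requires Ruby 3.1+, where Kernel#fork and Process.fork go through Process._fork.
module ForkGeneration
  @value = 0

  class << self
    attr_accessor :value
  end

  # Prepended to Process's singleton class so every fork bumps the counter
  # in the child only.
  module Hook
    def _fork
      pid = super
      ForkGeneration.value += 1 if pid == 0
      pid
    end
  end
end
Process.singleton_class.prepend(ForkGeneration::Hook)

# Hypothetical resource that must not be shared across a fork.
class Connection
  def initialize
    @generation = ForkGeneration.value
  end

  # True once the current process is a fork of the one that created us.
  def stale?
    @generation != ForkGeneration.value
  end
end

conn = Connection.new
stale_in_parent = conn.stale?

reader, writer = IO.pipe
child = fork do
  reader.close
  writer.write(conn.stale?.to_s)
  writer.close
  exit!(0)
end
writer.close
stale_in_child = reader.read
reader.close
Process.wait(child)
```

The check is a plain integer comparison with no system call, and it is unaffected by PID reuse because it never looks at the PID.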

Issue #19443 has been updated by byroot (Jean Boussier).

> PID can conflict because PID is recycled.

I don't think it's a big concern for this use case. Even with PID recycling, the PID of the child can't possibly be the same as the parent's. So unless you fork several times without ever triggering the check, you can't possibly be affected by this.

> We can define Process.fork_level as follows.

Yes, on Ruby 3.1+ we can decorate `Process._fork` for that purpose. I already submitted PRs to major libraries to do that when possible, however:

- It doesn't work for `Process.daemon` (not a big deal, but still)
- There is a long tail of existing code doing this, and fixing it all may take a very long time.

Also, regardless of what `Process.pid` is used for, if we can make it 5x faster with extremely little code, and as far as I can tell no downsides, why shouldn't we?

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-101918

Issue #19443 has been updated by byroot (Jean Boussier).

I deployed a Ruby shim of this cache to half of our servers: https://github.com/Shopify/pid_cache

- Average latency: -2ms
- Median latency: -2ms
- p75 latency: -2ms
- p99 latency: -10ms
- p99.9: -30ms

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-101964
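As an illustration of what a Ruby-level shim of this kind could look like, here is a minimal sketch. It is not the code of the pid_cache gem linked above; `PidCacheSketch` and its `cached` accessor are hypothetical names. It assumes Ruby 3.1+ so that the cache can be invalidated through `Process._fork`, and, as noted earlier in the thread, it would not cover `Process.daemon`.

```ruby
# Requires Ruby 3.1+ for Process._fork. Illustrative only.
module PidCacheSketch
  class << self
    attr_accessor :cached
  end

  module Patch
    # Return the memoized PID, asking the original Process.pid only
    # when nothing is cached yet.
    def pid
      PidCacheSketch.cached ||= super
    end

    # Drop the cached value in the child so the next Process.pid call
    # fetches the child's own PID.
    def _fork
      child_pid = super
      PidCacheSketch.cached = nil if child_pid == 0
      child_pid
    end
  end
end
Process.singleton_class.prepend(PidCacheSketch::Patch)

parent_pid = Process.pid # primes the cache in the parent

reader, writer = IO.pipe
child = fork do
  reader.close
  writer.write(Process.pid.to_s) # must be the child's PID, not the stale one
  writer.close
  exit!(0)
end
writer.close
pid_seen_by_child = reader.read.to_i
reader.close
Process.wait(child)
```

Note that a shim like this only affects callers of `Process.pid`; `$$` and `getpid()` calls made from C code are untouched.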

Issue #19443 has been updated by ko1 (Koichi Sasada).

Thank you. How should I read the results in comment #7?

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-101977

Issue #19443 has been updated by byroot (Jean Boussier).

> How to read comment #7 results?

It's a flat reduction on our latency (server response time) metrics.

On average, with the pid_cache shim, our server response time is 2 milliseconds faster.

Also of note, we're still seeing quite a lot of `getpid()` syscalls coming from dependencies using `$$`, and from some C extensions. So hopefully https://github.com/ruby/ruby/pull/7326 would be even more effective.

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-101981

Issue #19443 has been updated by byroot (Jean Boussier).

@dalehamel noticed via tracing that we're also calling `getpid()` quite a lot in the thread scheduler.

I think in that case we can simply use `GET_VM()->fork_gen`, so I prepared a second patch for that: https://github.com/ruby/ruby/pull/7434

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-102124

Issue #19443 has been updated by ko1 (Koichi Sasada).

byroot (Jean Boussier) wrote in #note-9:

> > How to read comment #7 results?
>
> It's a flat reduction on our latency (server response time) metrics.
> On average, with the pid_cache shim, our server response time is 2 milliseconds faster.

Thank you. BTW, it would be easier to understand whether 2ms is significant if you showed the measured values before and after. Anyway, I agree that it is a valuable improvement.

Anonymous wrote in #note-10:

> With modern CPU vulnerability mitigations, all system calls got more expensive. Perhaps Linux vDSO mechanism can be extended to support getpid as it does gettimeofday/clock_gettime

I see, especially with virtualization techniques.

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-102125

Issue #19443 has been updated by byroot (Jean Boussier).

> it is easy to understand how 2ms has impact or not by showing the measured values of before/after.

Yes, unfortunately, being a public company, we have all these rules about material information and such, so I wasn't sure what I could share exactly... But I realize it makes it harder to understand, sorry :/

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-102133

Issue #19443 has been updated by byroot (Jean Boussier).

Relaying here what Javier Honduvilla Coto said on one of the PRs:

> wondering if it would be possible/make sense to override libc's getpid with a custom implementation that does the caching in there. That way not only Process.getpid would use the faster method but also any other part of the runtime such as what @dalehamel mentioned above or any other getpid calls from native libraries?

I don't know if it's a good idea or not. As far as I know it would be a first for Ruby to override a symbol defined by libc; it doesn't do so with `malloc` and `free`, for instance.

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-102203

Issue #19443 has been updated by matz (Yukihiro Matsumoto).

Caching process id sounds OK for me. Go ahead.

Matz.

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-102472

Issue #19443 has been updated by byroot (Jean Boussier).

Thank you Matz!

I merged the caching of `Process.pid` and `$$`.

The thread scheduler still calls `getpid()` a lot, and I'll try to eliminate that in a follow-up (even though according to @ko1 most of that code will be replaced before 3.3).

----------------------------------------
Feature #19443: Cache `Process.pid`
https://bugs.ruby-lang.org/issues/19443#change-102475

* Author: byroot (Jean Boussier)
* Status: Closed
* Priority: Normal
participants (7)

- akr (Akira Tanaka)
- byroot (Jean Boussier)
- byroot (Jean Boussier)
- Eric Wong
- ko1 (Koichi Sasada)
- ko1 (Koichi Sasada)
- matz (Yukihiro Matsumoto)