[ruby-core:120897] [Ruby master Bug#21119] Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.

Issue #21119 has been reported by genya0407 (Yusuke Sangenya).

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119

* Author: genya0407 (Yusuke Sangenya)
* Status: Open
* ruby -v: ruby 3.5.0dev (2025-02-06T14:10:34Z master adbf9c5b36) +PRISM [arm64-darwin24]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
Executing the following code in Ruby 3.4.1 takes a very long time, especially when there are many files (100 or more) in the current directory.
This delay does not occur in Ruby 3.3.6.

## Reproducible script

```ruby
# hoge.rb

# Launch a thread to execute CPU-heavy task
Thread.new do
  loop do
    arr = []
    100.times do
      arr << rand(1...100)
    end
  end
end

# Execute a program containing `Dir.glob` in the main thread.
10.times do
  Dir.glob('*')
  puts "aaaa"
end
```

## Execution Results

Executing the above code in Ruby 3.4.1 takes **119.43s**.

```shell
$ ruby -v
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [arm64-darwin24]
$ time ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
ruby hoge.rb  119.43s user 0.30s system 99% cpu 1:59.89 total
```

Executing it in Ruby master also takes **118.87s**.

```shell
$ ~/opt-ruby/bin/ruby -v
ruby 3.5.0dev (2025-02-06T14:10:34Z master adbf9c5b36) +PRISM [arm64-darwin24]
$ time ~/opt-ruby/bin/ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
~/opt-ruby/bin/ruby hoge.rb  118.87s user 0.46s system 99% cpu 2:00.45 total
```

Executing it in Ruby 3.3.6 takes only **2.22s**.

```shell
$ ruby -v
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) [arm64-darwin24]
$ time ruby hoge.rb
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
aaaa
ruby hoge.rb  2.22s user 0.03s system 98% cpu 2.286 total
```

So the slowdown is roughly **50x**.

## Possible Cause

Since Ruby 3.4.0, `Dir.glob` releases the GVL frequently.

* https://bugs.ruby-lang.org/issues/20587
* https://github.com/ruby/ruby/pull/11147

Due to this change, whenever the CPU-heavy thread releases the GVL and `Dir.glob` gets it, `Dir.glob` releases it again almost immediately.
As a result, `Dir.glob` has to regain the GVL over and over, and each wait adds delay, causing a major slowdown in execution.

## Note about Execution Results

I measured the execution results under a stress condition, with 100 files in the current directory.
If there are fewer files, the slowdown may be less pronounced.

-- 
https://bugs.ruby-lang.org/
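For anyone who wants to compare the effect on their own machine without depending on the contents of the current directory, the following is a minimal sketch of a self-contained variant of the reproduction above. The helper names `time_glob` and `compare_glob_timings`, the temporary directory, and the small default file count are assumptions made for illustration and are not part of the original report. It only prints the two timings; how large the gap is will depend on the Ruby version and platform.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: wall-clock seconds spent in `runs` calls to Dir.glob.
def time_glob(dir, runs:)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { Dir.glob("*", base: dir) }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Hypothetical helper: time Dir.glob alone, then again while a
# CPU-heavy thread (same loop as in hoge.rb) competes for the GVL.
def compare_glob_timings(file_count: 10, runs: 2)
  Dir.mktmpdir do |dir|
    file_count.times { |i| FileUtils.touch(File.join(dir, "file#{i}")) }

    baseline = time_glob(dir, runs: runs)

    busy = Thread.new do
      loop do
        arr = []
        100.times { arr << rand(1...100) }
      end
    end

    begin
      contended = time_glob(dir, runs: runs)
    ensure
      # Stop the busy thread so the script can exit cleanly.
      busy.kill
      busy.join
    end

    { baseline: baseline, contended: contended }
  end
end

timings = compare_glob_timings
puts format("baseline:  %.4fs", timings[:baseline])
puts format("contended: %.4fs", timings[:contended])
```

Raising `file_count` toward 100 should bring the run closer to the stress condition described in the report, at the cost of a much longer run on affected versions.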

Issue #21119 has been updated by luke-gru (Luke Gruber).

This might be an issue with Kernel#loop being defined now in Ruby itself, and it never calls a primitive to check interrupts. Checking interrupts and having a timer interrupt would switch threads.

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119#change-111775

Issue #21119 has been updated by jeremyevans0 (Jeremy Evans).

It is simple to revert the GVL-releasing, but then no other thread can run while accessing the filesystem (which may block for a long period of time for networked filesystems).

GVL-releasing is a tradeoff. It mitigates damage if the filesystem access takes a long time, but it makes the common case slower. I think this issue is much more pronounced on Mac OS and other systems where `getattrlist`/`fgetattrlist` are used in order to determine whether normalization is needed, because then the GVL is released for every directory entry.

I don't have any opinion on whether the tradeoff is worth it in this case.

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119#change-111776
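A quick back-of-the-envelope check of the per-entry explanation, using only the numbers from the report. The sketch assumes that each `Dir.glob('*')` call touched about 100 entries (the report says "100 files") and that the whole run time can be attributed to those entries, which is a simplification. The result lands in the neighbourhood of Ruby's default thread time slice, commonly cited as about 100 ms, which would be consistent with waiting roughly one time slice per directory entry; treat that link as a hypothesis, not something the thread confirms.

```ruby
total_seconds    = 119.43 # `user` time on Ruby 3.4.1, from the report
baseline_seconds = 2.22   # `user` time on Ruby 3.3.6, from the report
glob_calls       = 10     # `10.times do Dir.glob('*') ... end`
entries_per_call = 100    # assumption: about 100 files in the directory

per_call     = total_seconds / glob_calls
per_entry_ms = per_call / entries_per_call * 1000
slowdown     = total_seconds / baseline_seconds

puts format("per glob call: %.2f s", per_call)      # 11.94 s
puts format("per entry:     %.1f ms", per_entry_ms) # 119.4 ms
puts format("slowdown:      %.1fx", slowdown)       # 53.8x
```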

Issue #21119 has been updated by byroot (Jean Boussier).

I don't think we should revert the GVL freeing, but we should really start to think about a smarter scheduler that doesn't penalize threads that release the GVL. It's a longer project though.

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119#change-111785

Issue #21119 has been updated by luke-gru (Luke Gruber).

Yeah sorry, it is the GVL, like you guys are saying. There are many syscalls here. It would be nice to just release it at the top and get it back after all the syscalls, but then there are probably a lot of Ruby functions in between the syscalls that need the GVL...

I agree with @byroot that we need a smarter scheduler for these cases. And as an alternative to not penalizing threads that release the GVL, we could do like Go and not release the GVL (the `P` in Go parlance) on potentially short blocking syscalls, and instead register the thread with a monitoring thread (maybe the timer thread?) before the syscall. That monitoring thread checks for Ruby threads that have been in this blocking state for too long and gives the GVL to another waiting thread if the limit is exceeded. If the time limit is not exceeded, the Ruby thread never yields. This way we could use the GVL release for calls that we know will block a while and use the optimistic no-release case for calls we think will be fast.

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119#change-111786
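To make the difference between the two policies concrete, here is a toy model of the proposal above. It is a sketch of the decision rule only, not of how CRuby or Go implement it: there are no real threads or locks, and `Syscall`, `handoffs_always_release`, `handoffs_optimistic`, the durations, and the 1 ms limit are all hypothetical values chosen for illustration. It counts how often the GVL would change hands for a run of many fast calls plus one slow one.

```ruby
# Hypothetical record of one blocking call and how long it takes.
Syscall = Struct.new(:name, :seconds)

# Policy described in the report: the GVL is released around every call,
# so every call is an opportunity for another thread to take it.
def handoffs_always_release(calls)
  calls.size
end

# Go-like proposal: keep the GVL during the call; a monitoring thread
# hands it to a waiter only when the call runs longer than `limit`.
def handoffs_optimistic(calls, limit:)
  calls.count { |call| call.seconds > limit }
end

# 100 fast per-entry calls plus one slow call (e.g. a networked filesystem).
calls = Array.new(100) { |i| Syscall.new("entry#{i}", 0.00001) }
calls << Syscall.new("slow_network_read", 0.5)

always     = handoffs_always_release(calls)
optimistic = handoffs_optimistic(calls, limit: 0.001)

puts "always release: #{always} handoffs"     # 101
puts "optimistic:     #{optimistic} handoffs" # 1
```

In this model the fast calls never give up the GVL, while the slow call still lets other threads run, which is the tradeoff the comment is aiming for.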

Issue #21119 has been updated by naruse (Yui NARUSE).

If there is a C Dir.glob implementation, we can run it in another pthread in parallel.

----------------------------------------
Bug #21119: Programs containing `Dir.glob` with a thread executing a CPU-heavy task run very slowly.
https://bugs.ruby-lang.org/issues/21119#change-111875
participants (5)
- byroot (Jean Boussier)
- genya0407 (Yusuke Sangenya)
- jeremyevans0 (Jeremy Evans)
- luke-gru (Luke Gruber)
- naruse (Yui NARUSE)