[ruby-core:112045] [Ruby master Bug#19378] Windows: Use less syscalls for faster require of big gems

Issue #19378 has been reported by aidog (Andi Idogawa).

----------------------------------------
Bug #19378: Windows: Use less syscalls for faster require of big gems
https://bugs.ruby-lang.org/issues/19378

* Author: aidog (Andi Idogawa)
* Status: Open
* Priority: Normal
* ruby -v: 3.2.0
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Hello 🙂

## Problem

`require` is slow on Windows for big gems (example: `require 'gtk3'` takes 3+ seconds). This is a problem for people who want to make cross-platform GUI apps with Ruby.

## Possible Reason

As touched on in [#15797](https://bugs.ruby-lang.org/issues/15797), it seems that `require` uses realpath, which is emulated on Windows. The emulation checks every parent directory, so the same syscalls run many times.

## Testfile

C:\tmp\speedtest\testrequire.rb:

``` ruby
require __dir__ + "/helloworld1.rb"
require __dir__ + "/helloworld2.rb"
```

``` shell
ruby --disable-gems C:\tmp\speedtest\testrequire.rb
```

### Syscalls per File/Directory:

1. CreateFile
2. QueryInformationVolume
3. QueryIdInformation
4. QueryAllInformationFile
5. QueryNameInformationFile
6. QueryNameInformationFile
7. QueryNormalizedNameInformationFile
8. CloseFile

### Files/Directories checked

1. C:\tmp
2. C:\tmp\speedtest
3. C:\tmp\speedtest\helloworld1.rb
4. C:\tmp
5. C:\tmp\speedtest
6. C:\tmp\speedtest\helloworld2.rb

For two required files Ruby had to do 8*6 = **48** syscalls. The syscalls originate from rb_w32_reparse_symlink_p / lstat.

Rubygems live in subfolders with 9+ parts: "C:\Ruby32-x64\lib\ruby\gems\3.2.0\gems\glib2-4.0.8\lib\glib2\variant.rb". Each file takes 8 * 9 = **72**+ calls. For variant.rb, whose path has 10 parts, it is **80** calls.

The results of the syscalls don't change in such a short time, so it should be possible to cache them. With require_relative it's twice as many calls.
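The arithmetic above can be checked with a small sketch. It assumes the rule implied by the Procmon trace (8 syscalls for every path component after the drive letter); the constant and both helper names are made up for illustration and do not come from the Ruby source.

``` ruby
# Rough model of the syscall counts reported above. The constant 8 and the
# "one check per path component" rule are taken from the Procmon
# observations in this report, not from Ruby's implementation.
SYSCALLS_PER_COMPONENT = 8

# Number of files/directories checked for one Windows path: every component
# after the drive letter (e.g. "C:") is checked once.
def checked_components(path)
  path.split(/[\\\/]/).reject(&:empty?).drop(1).size
end

# Estimated syscalls for resolving one path under that model.
def estimated_syscalls(path)
  SYSCALLS_PER_COMPONENT * checked_components(path)
end

test_files = [
  'C:\tmp\speedtest\helloworld1.rb',
  'C:\tmp\speedtest\helloworld2.rb'
]
gem_file = 'C:\Ruby32-x64\lib\ruby\gems\3.2.0\gems\glib2-4.0.8\lib\glib2\variant.rb'

puts test_files.sum { |f| estimated_syscalls(f) } # 48
puts estimated_syscalls(gem_file)                 # 80
```

Under this model the cost grows linearly with directory depth, which is why deeply nested gem paths are hit hardest.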
## Other testcases

Same result:

``` ruby
File.realpath __dir__ + "/helloworld1.rb"
File.realpath __dir__ + "/helloworld2.rb"
```

``` ruby
File.stat __dir__ + "/helloworld1.rb"
File.stat __dir__ + "/helloworld2.rb"
```

It does not happen with `$LOAD_PATH.resolve_feature_path(__dir__ + "/helloworld1.rb")`.

## Request

Would it be possible to cache the stat calls when using require? I tried to implement a cache inside the Ruby source code, but failed. If not, is there now a way to combine Ruby files into one?

I previously talked about require here: [YJIT: Windows support lacking.](https://bugs.ruby-lang.org/issues/19325#note-11)

## How to reproduce

Ruby versions: at least 3.0+, most likely older ones too. Tested using Ruby Installer 3.1 and 3.2.

[Procmon Software by Sysinternals](https://learn.microsoft.com/en-us/sysinternals/downloads/procmon)

-- 
https://bugs.ruby-lang.org/
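As a rough illustration of the caching idea in the request, the sketch below memoizes the real path of each directory so that a second file in the same directory does not resolve the parent chain again. `DirRealpathCache` is a hypothetical helper, not part of Ruby, and it only shows the shape of the idea at the Ruby level; an actual fix would have to live in the C implementation behind `require`, and a real cache would also need an invalidation strategy for directories that change.

``` ruby
require "tmpdir"

# Hypothetical helper (not part of Ruby): resolves a file's real path while
# memoizing the real path of its directory.
class DirRealpathCache
  attr_reader :dir_lookups

  def initialize
    @dirs = {}
    @dir_lookups = 0 # how often File.realpath was called for a directory
  end

  def realpath(path)
    dir, base = File.split(File.expand_path(path))
    real_dir = @dirs.fetch(dir) do
      @dir_lookups += 1
      @dirs[dir] = File.realpath(dir)
    end
    # The last component is still resolved freshly (it may be a symlink),
    # relative to the already-resolved directory.
    File.realpath(base, real_dir)
  end
end

Dir.mktmpdir do |dir|
  files = %w[helloworld1.rb helloworld2.rb].map do |name|
    File.join(dir, name).tap { |f| File.write(f, "") }
  end

  cache = DirRealpathCache.new
  files.each { |f| puts cache.realpath(f) }
  puts "directory lookups: #{cache.dir_lookups}" # 1 for both files
end
```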

Issue #19378 has been updated by aidog (Andi Idogawa).

File windows-no-realpath-require.patch (992 Bytes) added

Thanks to the new Windows build docs by ioquatix, I made a test patch to check how much faster it would be if some of the repeated syscalls on the folders (c:/tmp/, c:/tmp/speedtest, gems and so on) are avoided:

* tzinfo: 0.8s to 0.3s
* gtk3: 2.8s to 2.5s (I see another similar issue inside the gem C code)

https://bugs.ruby-lang.org/issues/19378#change-101547
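Before/after timings like the ones reported for the test patch can be gathered with a small harness along these lines. This is only a sketch: `time_require` is a made-up helper name, and because `require` loads a feature once per process, each measurement of a real gem should be taken in a fresh Ruby process.

``` ruby
require "benchmark"
require "tmpdir"

# Time a single `require` of +feature+ in seconds.
# A feature that is already loaded returns almost immediately, so measure
# in a fresh process or with a feature that has not been loaded yet.
def time_require(feature)
  Benchmark.realtime { require feature }
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "helloworld1.rb")
  File.write(path, "# empty\n")
  printf("require took %.4fs\n", time_require(path))
end
```

For a gem, the same measurement can be run once with the stock interpreter and once with a patched build, for example with `ruby -rbenchmark -e "puts Benchmark.realtime { require 'tzinfo' }"`.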

Issue #19378 has been updated by nobu (Nobuyoshi Nakada).

Status changed from Open to Assigned
Assignee set to windows

https://bugs.ruby-lang.org/issues/19378#change-101686

Issue #19378 has been updated by joshc (Josh C).

File windows-revert-79a4484a.patch (5.42 KB) added

I've also noticed a significant increase in file IO events (as reported by procmon) due to https://github.com/ruby/ruby/commit/79a4484a072e9769b603e7b4fbdb15b1d7eccb15, introduced in Ruby 3.1.0. The code tries to prevent the same file from being loaded twice by calling `rb_realpath_internal` to see if the realpath has already been loaded. This is a problem on systems like Windows that use Ruby's emulated realpath, especially when there are deeply nested directories.

I've attached a revert patch. It'd be great to use GetFinalPathNameByHandleW and avoid the emulation code.

https://bugs.ruby-lang.org/issues/19378#change-102015
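The duplicate-load check described in joshc's message can be pictured with a toy model in Ruby. `RealpathLoader` is hypothetical and is not how `require` is implemented in C; it only shows why every load pays for a realpath call when real paths are used as the deduplication key. The example reaches the same file through a path containing `..` instead of a symlink so that it also runs where symlinks are unavailable.

``` ruby
require "set"
require "tmpdir"

# Toy model (hypothetical, not Ruby's real implementation) of a loader that
# skips a file when its real path has already been loaded.
class RealpathLoader
  def initialize
    @loaded_realpaths = Set.new
  end

  # Returns true if the file was loaded now, false if it was seen before.
  def load_once(path)
    real = File.realpath(path) # the expensive step with an emulated realpath
    return false unless @loaded_realpaths.add?(real)

    load real
    true
  end
end

Dir.mktmpdir do |dir|
  file = File.join(dir, "helloworld1.rb")
  File.write(file, "# empty\n")
  # A second spelling of the same file, going up and back down one level.
  alias_path = File.join(dir, "..", File.basename(dir), "helloworld1.rb")

  loader = RealpathLoader.new
  p loader.load_once(file)       # true
  p loader.load_once(alias_path) # false: same real path
end
```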

Issue #19378 has been updated by jeremyevans0 (Jeremy Evans).

joshc (Josh C) wrote in #note-3:
> I've attached a revert patch.

I think the only way we would revert commit:79a4484a072e9769b603e7b4fbdb15b1d7eccb15 is if someone can come up with an alternative approach to fixing Bug #17885.

> It'd be great to use GetFinalPathNameByHandleW and avoid the emulate code.

If you mean to use this on Windows for the internals of File#realpath, I think we would be open to a backwards compatible patch for that, but @usa would need to decide as he maintains the mswin64 platform.

https://bugs.ruby-lang.org/issues/19378#change-102016

Issue #19378 has been updated by MSP-Greg (Greg L).

Code using `GetFinalPathNameByHandleW` already exists in win32/win32.c, see https://github.com/ruby/ruby/blob/c43fbe4ebd2b519601f0b90ca98fa096799d3846/w...

For cross-reference, see also [Bug #19246 'Rebuilding the loaded feature index much slower in Ruby 3.1'](https://bugs.ruby-lang.org/issues/19246)

https://bugs.ruby-lang.org/issues/19378#change-102083

Issue #19378 has been updated by MSP-Greg (Greg L).

Just to be clear, this issue affects all Windows MRI platforms, so both mswin64 and mingw32 (mingw & ucrt builds) are affected.

https://bugs.ruby-lang.org/issues/19378#change-102097

participants (5)
- aidog (Andi Idogawa)
- jeremyevans0 (Jeremy Evans)
- joshc (Josh C)
- MSP-Greg (Greg L)
- nobu (Nobuyoshi Nakada)