[ruby-core:115139] [Ruby master Bug#19969] Regression of memory usage with Ruby 3.1

Issue #19969 has been reported by hsbt (Hiroshi SHIBATA).

----------------------------------------
Bug #19969: Regression of memory usage with Ruby 3.1
https://bugs.ruby-lang.org/issues/19969

* Author: hsbt (Hiroshi SHIBATA)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Our company, ANDPAD, Inc., saw memory usage increase by about 20% after upgrading our Rails application from Ruby 3.0 to 3.2.

My colleague found the [root cause](https://bugs.ruby-lang.org/issues/16996) and wrote this reproduction code:

```
$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); Array.new(10000) { s1 - s2 - [0] }; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.0.6p216 (2023-06-29 revision bdfe1958a8) +JIT [arm64-darwin22]
248096

$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); Array.new(10000) { s1 - s2 - [0] }; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.2.2 (2023-07-05 revision 2f603bc4d7) +YJIT [arm64-darwin22]
2949280
```

Should we revert #16996 for Ruby 3.1 or later? I'm not sure the performance improvement justifies this increase in memory usage.

--
https://bugs.ruby-lang.org/

Issue #19969 has been updated by nobu (Nobuyoshi Nakada).

Might https://github.com/nobu/ruby/tree/rehash-after-delete help?

----------------------------------------
Bug #19969: Regression of memory usage with Ruby 3.1
https://bugs.ruby-lang.org/issues/19969#change-105048

* Author: hsbt (Hiroshi SHIBATA)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN

Issue #19969 has been updated by Eregon (Benoit Daloze).

Right, @nobu's approach seems much better than reintroducing that weird behavior for `.dup`.
Ideally we wouldn't rehash in the sense of calling the `key.hash` methods again, but would instead just shrink the internal data structure (and do the same when growing it).

----------------------------------------
Bug #19969: Regression of memory usage with Ruby 3.1
https://bugs.ruby-lang.org/issues/19969#change-105059

* Author: hsbt (Hiroshi SHIBATA)
* Status: Open
* Priority: Normal
* Backport: 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED

Issue #19969 has been updated by Eregon (Benoit Daloze).

So apparently some applications were relying on `Set#dup`/`Hash#dup` to act like C++'s [shrink_to_fit](https://en.cppreference.com/w/cpp/container/vector/shrink_to_fit).
Ruby does not have such a method and it feels quite low-level, so it seems better to resize the internal data structure when removing elements/entries takes it below some threshold.

----------------------------------------
Bug #19969: Regression of memory usage with Ruby 3.1
https://bugs.ruby-lang.org/issues/19969#change-105060

* Author: hsbt (Hiroshi SHIBATA)
* Status: Open
* Priority: Normal
* Backport: 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED
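To make the "shrink when going below some threshold" idea concrete, here is a toy model that tracks only an entry count and a capacity. It is a sketch, not how Ruby's hash table is implemented: the `ToyTable` class, the minimum capacity of 16, and the "shrink when less than a quarter full" rule are all assumptions chosen for illustration.

```ruby
# Toy capacity model only; this is NOT Ruby's internal hash table.
# ToyTable, MIN_CAPACITY and SHRINK_DIVISOR are hypothetical names/values.
class ToyTable
  MIN_CAPACITY = 16    # assumed floor for the backing store
  SHRINK_DIVISOR = 4   # assumed rule: shrink when size < capacity / 4

  attr_reader :size, :capacity

  def initialize
    @size = 0
    @capacity = MIN_CAPACITY
  end

  # Adding an entry doubles the capacity whenever it is exceeded.
  def add
    @size += 1
    @capacity *= 2 while @size > @capacity
    self
  end

  # Deleting an entry halves the capacity while occupancy stays below
  # the threshold, so a mostly-emptied table gives its space back.
  def delete
    return self if @size.zero?

    @size -= 1
    while @capacity > MIN_CAPACITY && @size < @capacity / SHRINK_DIVISOR
      @capacity /= 2
    end
    self
  end
end

table = ToyTable.new
10_000.times { table.add }
grown = table.capacity     # => 16384
9_999.times { table.delete }
shrunk = table.capacity    # => 16
```

With shrink-on-delete, the one-element result of the repro would no longer carry a backing store sized for 10,000 entries, and no explicit `dup` or shrink call is needed.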

Issue #19969 has been updated by Eregon (Benoit Daloze).

As a note, this repro code is very "lucky" to trigger a `dup` after removing 99.99% of the elements.
I suppose it's done that way to make the effect very clear though.

Without the `- [0]` the same problem occurs on 3.0:

```
$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); a=Array.new(10000) { s1 - s2 }; GC.start; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.0.6p216 (2023-03-30 revision 23a532679b) [x86_64-linux]
3015808

$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); a=Array.new(10000) { s1 - s2 - [0] }; GC.start; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.0.6p216 (2023-03-30 revision 23a532679b) [x86_64-linux]
74552
```

If a Set is kept alive for a long time, one way to ensure it uses the minimum amount of space is `Set#reset`, at the cost of extra time to reset/rehash (which notably calls `#hash` for every key). It's a time vs. memory trade-off that can be worth it for big, long-lived sets:

```
$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); a=Array.new(10000) { s=s1 - s2 - [0]; s.reset; s }; GC.start; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
62992
```

Automatic shrinking (PR at https://github.com/ruby/ruby/pull/8748) should help the worst cases like the repro, so that seems good anyway.

----------------------------------------
Bug #19969: Regression of memory usage with Ruby 3.1
https://bugs.ruby-lang.org/issues/19969#change-105224

* Author: hsbt (Hiroshi SHIBATA)
* Status: Open
* Priority: Normal
* Backport: 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED
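For code that keeps such a set around, the `Set#reset` workaround could be wrapped up as in the sketch below. The `compact_difference` helper is a hypothetical name introduced here, not something from the thread or the standard library. How much memory this saves depends on the Ruby version, so treat it as something to measure rather than a guaranteed win.

```ruby
require "set"

# Hypothetical helper: compute a difference that will be kept for a long
# time, then call Set#reset so the result is rehashed before being stored.
# This pays one extra #hash call per remaining element.
def compact_difference(a, b)
  result = a - b
  result.reset # Set#reset returns self
  result
end

s1 = Set.new(10_000.times)
s2 = Set.new(9_999.times)

# Only the element 9999 remains after removing 0..9998 from 0..9999.
long_lived = compact_difference(s1, s2)
```

The helper only makes sense for sets that outlive the operation that built them. For short-lived temporaries the extra rehash is wasted work.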

Issue #19969 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 3.0: DONTNEED, 3.1: REQUIRED, 3.2: REQUIRED to 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE

ruby_3_2 1cc38d5a2f84733e1c2e42548639e2891fe61e69 merged revision(s) 9eac9d71786a8dbec520d0541a91149f01adf8ea.

----------------------------------------
Bug #19969: Regression of memory usage with Ruby 3.1
https://bugs.ruby-lang.org/issues/19969#change-105353

* Author: hsbt (Hiroshi SHIBATA)
* Status: Closed
* Priority: Normal
* Backport: 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE

Issue #19969 has been updated by hsbt (Hiroshi SHIBATA).

Thanks nobu and nagachika. I confirmed that the `ruby_3_2` branch resolves this regression.

```
# Before
$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); Array.new(10000) { s1 - s2 - [0] }; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.2.2 (2023-07-05 revision 2f603bc4d7) +YJIT [arm64-darwin23]
4564304

# After
$ ruby -v -rset -e 's1 = Set.new(10000.times); s2 = Set.new(9999.times); Array.new(10000) { s1 - s2 - [0] }; puts `ps -o rss= -p #{$$}`.to_i'
ruby 3.2.2 (2023-11-19 revision d9f4f321c6) +YJIT [arm64-darwin23]
40864
```

----------------------------------------
Bug #19969: Regression of memory usage with Ruby 3.1
https://bugs.ruby-lang.org/issues/19969#change-105354

* Author: hsbt (Hiroshi SHIBATA)
* Status: Closed
* Priority: Normal
* Backport: 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE

Issue #19969 has been updated by usa (Usaku NAKAMURA).

Backport changed from 3.0: DONTNEED, 3.1: REQUIRED, 3.2: DONE to 3.0: DONTNEED, 3.1: DONE, 3.2: DONE

ruby_3_1 1cae5e7ceaca7304108fdec35d4858a9e4ff7fe0 merged revision(s) 9eac9d71786a8dbec520d0541a91149f01adf8ea.

----------------------------------------
Bug #19969: Regression of memory usage with Ruby 3.1
https://bugs.ruby-lang.org/issues/19969#change-105355

* Author: hsbt (Hiroshi SHIBATA)
* Status: Closed
* Priority: Normal
* Backport: 3.0: DONTNEED, 3.1: DONE, 3.2: DONE
participants (5)
- Eregon (Benoit Daloze)
- hsbt (Hiroshi SHIBATA)
- nagachika (Tomoyuki Chikanaga)
- nobu (Nobuyoshi Nakada)
- usa (Usaku NAKAMURA)