March 2023 - ruby-core - ml.ruby-lang.org

[ruby-core:112399] [Ruby master Bug#19436] Call Cache for singleton methods can lead to "memory leaks"

by byroot (Jean Boussier)

Issue #19436 has been reported by byroot (Jean Boussier).

----------------------------------------
Bug #19436: Call Cache for singleton methods can lead to "memory leaks"
https://bugs.ruby-lang.org/issues/19436

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Using "memory leaks" with quotes because, strictly speaking, the memory isn't leaked, but it can nonetheless lead to large memory overheads.

### Minimal Reproduction

```ruby
module Foo
  def bar
  end
end

def call_bar(obj)
  # Here the call cache will keep a ref on the method_entry,
  # which then keeps a ref on the singleton_class, making that
  # instance immortal until the method is called again with
  # another instance.
  # The reference chain is IMEMO(callcache) -> IMEMO(ment) -> ICLASS -> CLASS(singleton) -> OBJECT
  obj.bar
end

obj = Object.new
obj.extend(Foo)

call_bar(obj)
id = obj.object_id
obj = nil

4.times { GC.start }
p ObjectSpace._id2ref(id)
```

### Explanation

Call caches keep a strong reference onto the "callable method entry" (CME), which itself keeps a strong reference on the called object's class, and in the case of a singleton class, it keeps a strong reference onto the `attached_object` (instance).

This means that any call site that calls a singleton method will effectively keep a strong reference onto the last receiver. If the method is frequently called it's not too bad, but if it's infrequently called, it's effectively a (bounded) memory leak. And if the `attached_object` is big, the wasted memory can be very substantial.

### Practical Implications

One relatively common API impacted by this is [Rails' `extending` API](https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#met…. This API allows extending a "query result set" with a module. These query result sets can sometimes be very big, especially since they keep references to the instantiated `ActiveRecord::Base` instances etc.

### Possible Solutions

#### Only keep a weak reference to the CME

The fairly "obvious" solution is to keep a weak reference to the CME. That's what I explored in https://github.com/ruby/ruby/pull/7272, and it seems to work. However, in debug mode it does fail on an assertion during compaction, and it isn't quite clear to me what the impact is.

Additionally, something that makes me think this would be the right solution is that call caches already try to avoid marking the class:

```c
// vm_callinfo.h:275
struct rb_callcache {
    const VALUE flags;

    /* inline cache: key */
    const VALUE klass; // should not mark it because klass can not be free'd
                       // because of this marking. When klass is collected,
                       // cc will be cleared (cc->klass = 0) at vm_ccs_free().
```

So it appears that the class being also marked through the CME is some kind of oversight?

#### Don't cache based on some heuristics

If the above isn't possible or is too complicated, an alternative would be to not cache CMEs found in singleton classes, except if it's the singleton class of a `Class` or `Module`.

It would make repeated calls to such methods slower, but the assumption is that it's unlikely that these CMEs would live very long.

#### Make `Class#attached_object` a weak reference

Alternatively, we could make the `attached_object` a weak reference, which would drastically limit the amount of memory that may be leaked in such a scenario.

The downside is that `Class#attached_object` was very recently exposed in Ruby 3.2.0, so it means changing its semantics a bit.

cc @peterzhu2118 @ko1

-- https://bugs.ruby-lang.org/
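The Ruby-visible half of the reference chain described in the report can be inspected from plain Ruby. The following is only a sketch of that idea: it shows the links from the method back to the receiver, and says nothing about the internal call cache or IMEMO objects, which are not reachable from Ruby code. The variable names `singleton` and `chain` are illustrative.

```ruby
module Foo
  def bar
  end
end

obj = Object.new
obj.extend(Foo)

singleton = obj.singleton_class

# `extend` puts Foo into the singleton class's ancestor chain, ahead of Object.
chain = singleton.ancestors
p chain

# The method a call site resolves `obj.bar` to is owned by Foo, but it is
# reached through the singleton class of this particular receiver.
p obj.method(:bar).owner

# Class#attached_object only exists on Ruby 3.2+, so guard the call.
if singleton.respond_to?(:attached_object)
  p singleton.attached_object.equal?(obj)
end
```

Every link printed here is a strong reference, which is consistent with the report's point that whatever keeps the method entry alive also keeps the receiver alive.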

[ruby-core:111740] [Ruby master Bug#19325] Windows support lacking.

by dsisnero (Dominic Sisneros)

Issue #19325 has been reported by dsisnero (Dominic Sisneros).

----------------------------------------
Bug #19325: Windows support lacking.
https://bugs.ruby-lang.org/issues/19325

* Author: dsisnero (Dominic Sisneros)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Ruby's support on Windows has always been second class. With some of the recent decisions, Windows support is falling even further behind. Recent developments in MJIT and YJIT that exclude Windows are two glaring issues that should be corrected.

Googling 'percent of windows vs other operating systems' shows Windows has a share of 76%. Ceding those users to Python and other programming languages has to be one of the reasons Python continues to gain market share from Ruby.

With Rust having first-class Windows support and threading support, is there a reason why YJIT is not able to work on Windows?

Also, Windows compiler support has matured enough and vcpkg support has evolved enough that it seems it should be possible to finally get a Ruby version without having to use MSYS2. Even the Crystal language has a version that runs on Windows without needing MSYS2.

-- https://bugs.ruby-lang.org/

[ruby-core:112948] [Ruby master Bug#19543] Resizing IO::Buffer to zero bytes fails

by hanazuki (Kasumi Hanazuki)

Issue #19543 has been reported by hanazuki (Kasumi Hanazuki).

----------------------------------------
Bug #19543: Resizing IO::Buffer to zero bytes fails
https://bugs.ruby-lang.org/issues/19543

* Author: hanazuki (Kasumi Hanazuki)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0dev (2023-03-20T04:02:21Z master 7f696b8859) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
```
irb(main):001:0> IO::Buffer.new(1).resize(0)
/home/kasumi/.local/src/github.com/ruby/ruby/-e:1: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
/home/kasumi/.local/src/github.com/ruby/ruby/-e:1: [BUG] rb_sys_fail(rb_io_buffer_resize:realloc) - errno == 0

ruby 3.3.0dev (2023-03-20T04:02:21Z master 7f696b8859) [x86_64-linux]

# full trace is attached to this ticket
```

`IO::Buffer#resize(0)` will result in calling `realloc(data->base, size)` with size = 0 in [rb_io_buffer_resize](https://bugs.ruby-lang.org/projects/ruby-master/reposi…. Zero-sized `realloc` is deprecated in C (and will be UB in C23).

---Files--------------------------------
bug.txt (40 KB)

-- https://bugs.ruby-lang.org/

[ruby-core:111698] [Ruby master Bug#19318] Float#round rounds incorrectly for some cases

by Eregon (Benoit Daloze)

Issue #19318 has been reported by Eregon (Benoit Daloze).

----------------------------------------
Bug #19318: Float#round rounds incorrectly for some cases
https://bugs.ruby-lang.org/issues/19318

* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: REQUIRED, 3.1: REQUIRED, 3.2: REQUIRED
----------------------------------------
This was discovered by @aardvark179.

The following spec in `spec/ruby/core/float/round_spec.rb` fails on CRuby:

```ruby
ruby_bug "", ""..."3.3" do
  # These numbers are neighbouring floating point numbers around a
  # precise value. They test that the rounding modes work correctly
  # around that value and precision is not lost which might cause
  # incorrect results.
  it "does not lose precision during the rounding process" do
    767573.1875850001.round(5, half: nil).should eql(767573.18759)
    767573.1875850001.round(5, half: :up).should eql(767573.18759)
    767573.1875850001.round(5, half: :down).should eql(767573.18759)
    767573.1875850001.round(5, half: :even).should eql(767573.18759)
    -767573.1875850001.round(5, half: nil).should eql(-767573.18759)
    -767573.1875850001.round(5, half: :up).should eql(-767573.18759)
    -767573.1875850001.round(5, half: :down).should eql(-767573.18759)
    -767573.1875850001.round(5, half: :even).should eql(-767573.18759)
    767573.187585.round(5, half: nil).should eql(767573.18759)
    767573.187585.round(5, half: :up).should eql(767573.18759)
    767573.187585.round(5, half: :down).should eql(767573.18758)
    767573.187585.round(5, half: :even).should eql(767573.18758)
    -767573.187585.round(5, half: nil).should eql(-767573.18759)
    -767573.187585.round(5, half: :up).should eql(-767573.18759)
    -767573.187585.round(5, half: :down).should eql(-767573.18758)
    -767573.187585.round(5, half: :even).should eql(-767573.18758)
    767573.1875849998.round(5, half: nil).should eql(767573.18758)
    767573.1875849998.round(5, half: :up).should eql(767573.18758)
    767573.1875849998.round(5, half: :down).should eql(767573.18758)
    767573.1875849998.round(5, half: :even).should eql(767573.18758)
    -767573.1875849998.round(5, half: nil).should eql(-767573.18758)
    -767573.1875849998.round(5, half: :up).should eql(-767573.18758)
    -767573.1875849998.round(5, half: :down).should eql(-767573.18758)
    -767573.1875849998.round(5, half: :even).should eql(-767573.18758)
  end
end
```

Yet this test, to the best of our knowledge, is correct.

This was fixed on master by @mrkn in https://github.com/ruby/ruby/pull/7023 (thanks!).

The question is: should we backport this? I think yes.

-- https://bugs.ruby-lang.org/
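The spec's comment says its literals are neighbouring doubles around a precise decimal value. One way to inspect that claim, sketched here with only core methods (`Float#to_r`, `Float#prev_float`, `Float#next_float`, `Kernel#Rational`), is to compare the stored double with the exact decimal and to print its neighbours. The variable names are illustrative, and no particular output is asserted because it was not verified against a specific Ruby build.

```ruby
mid_literal = 767573.187585
exact_mid   = Rational("767573.187585") # the decimal value, held exactly

# Float#to_r is exact, so <=> reports on which side of the decimal
# midpoint the stored double lies (-1 below, 1 above).
side = mid_literal.to_r <=> exact_mid
p side

# The adjacent doubles, to compare with the spec's other two literals.
p mid_literal.prev_float
p mid_literal.next_float
p 767573.1875849998 == mid_literal.prev_float
p 767573.1875850001 == mid_literal.next_float
```

The decimal 767573.187585 cannot be an exact binary fraction, so `side` is never 0. That is why the `half:` modes in the spec are sensitive to how much precision the rounding routine retains.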

[ruby-core:112146] [Ruby master Bug#19394] cvars in instance of cloned class point to source class's cvars even after class_variable_set on clone

by jamescdavis (James Davis)

Issue #19394 has been reported by jamescdavis (James Davis).

----------------------------------------
Bug #19394: cvars in instance of cloned class point to source class's cvars even after class_variable_set on clone
https://bugs.ruby-lang.org/issues/19394

* Author: jamescdavis (James Davis)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.3p185 (2022-11-24 revision 1a6b16756e) [x86_64-darwin21]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
This unexpected change in behavior happens between Ruby 3.0.x and 3.1.x. In Ruby >= 3.1, when a class with a cvar is cloned (or duped), the cvar in instances of the cloned class continues to point to the source class’s cvar after the clone has its cvar updated with `class_variable_set`. In Ruby < 3.1, the cloned class instance points to the updated cvar, as expected.

It seems likely that this is a bug in the [cvar cache](https://bugs.ruby-lang.org/issues/17763) introduced in Ruby 3.1.

Repro:

```rb
class Foo
  @@bar = 'bar'

  def print_bar
    puts "#{self.class.name} (from instance): #{@@bar} #{@@bar.object_id}"
  end
end

foo_bar = Foo.class_variable_get(:@@bar)
puts "Foo (class_variable_get): #{foo_bar} #{foo_bar.object_id}"
Foo.new.print_bar

FooClone = Foo.clone
FooClone.class_variable_set(:@@bar, 'bar_clone')

foo_clone_bar = FooClone.class_variable_get(:@@bar)
puts "FooClone (class_variable_get): #{foo_clone_bar} #{foo_clone_bar.object_id}"
FooClone.new.print_bar
```

Ruby 3.0.5:

```
Foo (class_variable_get): bar 60
Foo (from instance): bar 60
FooClone (class_variable_get): bar_clone 80
FooClone (from instance): bar_clone 80
```

Ruby 3.1.3, 3.2.0:

```
Foo (class_variable_get): bar 60
Foo (from instance): bar 60
FooClone (class_variable_get): bar_clone 80
FooClone (from instance): bar 60
```

Something similar happens when there are multiple clones and a cvar that the source class does not have defined is set on the clones. In this case, the cvars in instances of the clones all point to the first clone’s cvar.

Repro:

```rb
class Foo
  def print_bar
    puts "#{self.class.name} (from instance): #{@@bar} #{@@bar.object_id}"
  end
end

Foo1 = Foo.clone
Foo2 = Foo.clone
Foo3 = Foo.clone

Foo1.class_variable_set(:@@bar, 'bar1')
Foo2.class_variable_set(:@@bar, 'bar2')
Foo3.class_variable_set(:@@bar, 'bar3')

foo1_bar = Foo1.class_variable_get(:@@bar)
foo2_bar = Foo2.class_variable_get(:@@bar)
foo3_bar = Foo3.class_variable_get(:@@bar)

puts "Foo1 (class_variable_get): #{foo1_bar} #{foo1_bar.object_id}"
puts "Foo2 (class_variable_get): #{foo2_bar} #{foo2_bar.object_id}"
puts "Foo3 (class_variable_get): #{foo3_bar} #{foo3_bar.object_id}"

Foo1.new.print_bar
Foo2.new.print_bar
Foo3.new.print_bar
```

Ruby 3.0.5:

```
Foo1 (class_variable_get): bar1 60
Foo2 (class_variable_get): bar2 80
Foo3 (class_variable_get): bar3 100
Foo1 (from instance): bar1 60
Foo2 (from instance): bar2 80
Foo3 (from instance): bar3 100
```

Ruby 3.1.3, 3.2.0:

```
Foo1 (class_variable_get): bar1 60
Foo2 (class_variable_get): bar2 80
Foo3 (class_variable_get): bar3 100
Foo1 (from instance): bar1 60
Foo2 (from instance): bar1 60
Foo3 (from instance): bar1 60
```

-- https://bugs.ruby-lang.org/

[ruby-core:111565] [Ruby master Bug#19293] The new Time.new(String) API is nice... but we still need a stricter version of this

by matsuda (Akira Matsuda)

Issue #19293 has been reported by matsuda (Akira Matsuda).

----------------------------------------
Bug #19293: The new Time.new(String) API is nice... but we still need a stricter version of this
https://bugs.ruby-lang.org/issues/19293

* Author: matsuda (Akira Matsuda)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0dev (2023-01-01T07:39:00Z master 542e984d82) +YJIT [arm64-darwin21]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
The Ruby 3.2 style `Time.new(String)` API works very well so far, but since the original `Time.new(Integer, Integer, Integer...)` API actually accepts String objects as its arguments, there's one ambiguous case as follows:

`Time.new('20230123') #=> 20230123-01-01 00:00:00 +0900`

The problem I'm facing is that we cannot tell whether `Time.new` would parse the given String as ISO8601-ish or as just a year. To avoid this ambiguity, we still need to somehow parse the String beforehand on the application side (as we do in Ruby on Rails: https://github.com/rails/rails/blob/c49b8270/activemodel/lib/active_model/t…), then dispatch to the new `Time.new` only when the String is validated to conform to the ISO format.

Otherwise, if we just optimistically pass given Strings to `Time.new`, we'll occasionally get a Time object with an unintended buggy value.

Therefore, it unfortunately seems that my feature request on #16005 still continues... I have to keep proposing that we need either of the following:

1. A trustworthy version of an ISO8601 parser method, perhaps with a name other than `.new`, that accepts strict ISO8601-ish Strings only (but with the T delimiter, I still don't know what the proper name of this format is).
2. Change `Time.new(Integer-ish, Integer-ish, Integer-ish...)` not to accept Integer-ish Strings but to accept only Integers. But I can imagine that this direction is very unlikely to be acceptable, due to the incompatibility.

-- https://bugs.ruby-lang.org/
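The "validate first, then construct" workaround described in the report can be sketched roughly as follows. This is not the Rails implementation and not a proposed core API: `STRICT_TIME` and `strict_time` are hypothetical names, and the accepted format (date, a `T` or space delimiter, time, optional `+HH:MM`/`-HH:MM` offset) is an assumption made for illustration. It builds the `Time` from integer components, so it does not depend on the Ruby 3.2 String form of `Time.new`.

```ruby
# Hypothetical helper, not part of Ruby or Rails.
# Accepts "YYYY-MM-DD hh:mm:ss" or "YYYY-MM-DDThh:mm:ss", optionally followed
# by a "+HH:MM" / "-HH:MM" offset. Everything else is rejected.
STRICT_TIME = /\A(\d{4})-(\d{2})-(\d{2})[T ](\d{2}):(\d{2}):(\d{2})(?: ?([+-]\d{2}:\d{2}))?\z/

def strict_time(str)
  m = STRICT_TIME.match(str)
  return nil unless m

  year, mon, day, hour, min, sec = m.captures.first(6).map { |s| Integer(s, 10) }
  # A nil offset means "local time", same as omitting the argument.
  Time.new(year, mon, day, hour, min, sec, m[7])
rescue ArgumentError
  # Out-of-range components such as month 13 are rejected by Time.new.
  nil
end

p strict_time("20230123")                    # => nil (never treated as a year)
p strict_time("2023-01-23 12:34:56 +09:00")
```

A bare digit string is rejected rather than silently interpreted as a year. Note that this sketch still inherits `Time.new`'s normalization of some values (for example a day past the end of a short month), which a stricter parser would probably want to reject as well.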

[ruby-core:112918] [Ruby master Bug#19532] Handling of 6-byte codepoints in left_adjust_char_head in CESU-8 encoding is broken

by Eregon (Benoit Daloze)

Issue #19532 has been reported by Eregon (Benoit Daloze).

----------------------------------------
Bug #19532: Handling of 6-byte codepoints in left_adjust_char_head in CESU-8 encoding is broken
https://bugs.ruby-lang.org/issues/19532

* Author: Eregon (Benoit Daloze)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: REQUIRED, 3.1: REQUIRED, 3.2: REQUIRED
----------------------------------------
Fix in https://github.com/ruby/ruby/pull/7510

-- https://bugs.ruby-lang.org/

[ruby-core:112926] [Ruby master Misc#19535] Instance variables order is unpredictable on objects with `OBJ_TOO_COMPLEX_SHAPE_ID`

by byroot (Jean Boussier)

Issue #19535 has been reported by byroot (Jean Boussier).

----------------------------------------
Misc #19535: Instance variables order is unpredictable on objects with `OBJ_TOO_COMPLEX_SHAPE_ID`
https://bugs.ruby-lang.org/issues/19535

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

I've been helping the Mastodon folks investigate a weird Marshal deserialization bug they have randomly experienced since they upgraded to Ruby 3.2: https://github.com/mastodon/mastodon/issues/23644

Ultimately the bug comes from a circular dependency issue in the object graph that is serialized when one calls `Marshal.dump` on an `ActiveRecord::Base` object.

A simplified reproduction to better explain the problem is:

```ruby
class Status
  def normal_order
    @attributes = { id: 42 }
    @relations = { self => 1 }
    self
  end

  def inverse_order
    @relations = nil
    @attributes = { id: 42 }
    @relations = { self => 1 }
    self
  end

  def hash
    @attributes.fetch(:id)
  end
end

s = Marshal.load(Marshal.dump(Status.new.normal_order))
s = Marshal.load(Marshal.dump(Status.new.inverse_order))
```

In short, that `Status` object is both the top-level object and a key in a hash in that same payload. It also defines a custom `#hash` method that requires some other attribute to be set.

It all "works" as long as `@attributes` is dumped before `@relations`.

### Problem

The above micro-reproduction uses two different shapes to demonstrate the ordering issue, but in both cases the ordering is predictable.

However, if you generate too many shapes from a single class, it will be marked as `TOO_COMPLEX`, and future instances will have their instance variables backed by an `id_table`, which is unordered and will cause a similar issue.

I definitely consider this a bug on the Rails side, and I will do what I can so that Rails doesn't depend on that implicit ordering.

However, it's unlikely we'll be able to fix older versions, and other users may run into this issue when upgrading to Ruby 3.2, so I think it may be worth trying to preserve some sort of predictable ordering, at least for a few more versions.

Additionally, debugging it was made particularly difficult because it would work fine initially, and then break after enough shapes had been generated. Generally speaking, I think such semi-predictable behavior is much worse than fully random behavior (similar to how Go randomizes key order in its maps).

### Historical behavior

On Ruby 3.1 and older, the instance variable ordering was defined by the order in which each ivar appeared for the very first time:

```ruby
class Foo
  def set
    @a = 1
    @b = 2
    @c = 3
    self
  end

  def inverse_order
    @c = 3
    @b = 2
    @a = 1
    self
  end
end

p Foo.new.set.instance_variables # => [:@a, :@b, :@c]
p Foo.new.inverse_order.instance_variables # => [:@a, :@b, :@c]
```

This means that the order could be different from one execution of the program to another, but would remain stable inside a single process.

On 3.2, it's now defined by the order in which each ivar appeared in that specific object instance:

```ruby
[:@a, :@b, :@c]
[:@c, :@b, :@a]
```

Except if the object is backed by an `id_table`, in which case it's fully unpredictable.

### Possible changes

I discussed this with @tenderlovemaking, and he suggested we could change the `id_table` for an `st_table` so that the ordering could be predictable again, and would behave like objects with a non-complex shape.

Another possibility would be to preserve the observable behavior of 3.1 and older.

Or of course we could clearly specify that the ordering is random, but if so I think it would be wise to make it always random, so that this class of bugs has a much higher chance of being caught early in testing rather than in production.

cc @Eregon as I presume this has implications for TruffleRuby as well.

-- https://bugs.ruby-lang.org/

[ruby-core:112906] [Ruby master Bug#19531] ObjectSpace::WeakMap: replaced values still clear the key they were assigned to

by byroot (Jean Boussier)

Issue #19531 has been reported by byroot (Jean Boussier).

----------------------------------------
Bug #19531: ObjectSpace::WeakMap: replaced values still clear the key they were assigned to
https://bugs.ruby-lang.org/issues/19531

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Backport: 2.7: WONTFIX, 3.0: REQUIRED, 3.1: REQUIRED, 3.2: REQUIRED
----------------------------------------
### Reproduction script

```ruby
wmap = ObjectSpace::WeakMap.new
a = "A"
b = "B"
wmap[1] = a
wmap[1] = b # the table entry with 1 is still in the list of entries to clear when `a` is GCed
a = nil
GC.start
p wmap[1] # Should be `"B"`, but is `nil`
```

### Explanation

What happens is that when we set `wmap[1] = "A"`, WeakMap internally keeps a list of keys to clear when `"A"` is GCed, e.g. in pseudo code:

```ruby
class WeakMap
  def []=(key, value)
    @hash[key] = value
    @reverse[value] << key
  end
end
```

But it doesn't clear the previously kept mapping when a key is overwritten.

I'll work on a fix.

### References

https://github.com/protocolbuffers/protobuf/pull/12216

-- https://bugs.ruby-lang.org/
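To make the missing step concrete, here is a pure-Ruby model that extends the report's pseudo code. It is not the actual fix that landed in CRuby, and it involves no real weak references or GC: `ModelWeakMap` and `simulate_collect` are hypothetical names, with `simulate_collect` standing in for the finalizer that runs when a value is collected.

```ruby
# A toy model of the bookkeeping only. Values are held strongly here.
class ModelWeakMap
  def initialize
    @hash = {}
    # value -> keys pointing at it, compared by object identity
    @reverse = Hash.new { |h, k| h[k] = [] }.compare_by_identity
  end

  def [](key)
    @hash[key]
  end

  def []=(key, value)
    # The missing step: forget the key in the old value's reverse list.
    @reverse[@hash[key]].delete(key) if @hash.key?(key)

    @hash[key] = value
    @reverse[value] << key
  end

  # Stand-in for the finalizer that runs when `value` is garbage collected.
  def simulate_collect(value)
    keys = @reverse.delete(value) || []
    keys.each { |key| @hash.delete(key) }
  end
end

wmap = ModelWeakMap.new
a = "A"
b = "B"
wmap[1] = a
wmap[1] = b
wmap.simulate_collect(a)
p wmap[1] # => "B"
```

Without the extra line in `[]=`, collecting `a` would still find key `1` in its reverse list and delete the entry that now belongs to `b`, which matches the behaviour shown in the reproduction script.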

[ruby-core:111873] [Ruby master Bug#19351] Promote bundled gems at Ruby 3.3

by hsbt (Hiroshi SHIBATA)

Issue #19351 has been reported by hsbt (Hiroshi SHIBATA).

----------------------------------------
Bug #19351: Promote bundled gems at Ruby 3.3
https://bugs.ruby-lang.org/issues/19351

* Author: hsbt (Hiroshi SHIBATA)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
In Ruby 3.2, the only change to the default gems and bundled gems was the addition of `syntax_suggest`.

Some committers and I are considering promoting default gems to bundled gems again for Ruby 3.3+. We hope to keep the current developer experience with dependency resolution and to spare developers additional work such as "Put gem "xxx" into your Gemfile".

### Proposal

We propose that the following libraries be promoted from default gems to bundled gems in Ruby 3.3. They are not dependencies of Rails or RubyGems/Bundler.

```
abbrev
getoptlong
optparse
observable
resolv
resolv-replace
rinda
un
fcntl
nkf
syslog
win32ole
```

### Additional works

I also propose to promote Rails dependencies:

```
ostruct
base64
irb
rdoc
tsort
singleton
delegate
```

and gems maintained by @kou:

```
csv
strscan
fiddle
stringio
```

But if we promote them to bundled gems, many users will need to add `gem "csv"` to their Gemfile. I'm looking for a way to avoid this situation.

Can we add a specific feature for bundled gems to RubyGems or Bundler? For example, Bundler could have an allowlist for bundled gems, so that listed gems could be required without a Gemfile entry under `bundle exec`.

-- https://bugs.ruby-lang.org/
