[ruby-core:121659] [Ruby Bug#21267] respond_to check in Class#allocate is inconsistent

Issue #21267 has been reported by jhawthorn (John Hawthorn). ---------------------------------------- Bug #21267: respond_to check in Class#allocate is inconsistent https://bugs.ruby-lang.org/issues/21267 * Author: jhawthorn (John Hawthorn) * Status: Open * Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN ---------------------------------------- `Class#allocate` has an additional `rb_obj_respond_to(klass, rb_intern("allocate"))` check to forbid allocate being called on a class where it has been made private or undefined, even if used via ex. `bind_call`. ```
Rational.allocate (irb):1:in '<main>': undefined method 'allocate' for class Rational (NoMethodError) Class.instance_method(:allocate).bind_call(Rational) (irb):1:in 'Class#allocate': calling Rational.allocate is prohibited (TypeError)
However I don't think this provides any additional protection from users accessing an uninitialized object, as the user can redefine allocate to anything to bypass the check:
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.allocate; end}) => (0/1)
Or even override `respond_to_missing?`
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.respond_to_missing? *; true; end}) => (0/1)
So I think we should remove this check. For classes that we need to forbid allocation we should use `rb_undef_alloc_func`.
The classes I see this used for are:
* MatchData
* Refinement
* Module
* Complex
* Rational
My main motivation is that this check makes `Class#allocate` slow. There are ways we could improve that, but I don't think the check as-is is useful. If there's an alternative, more robust, check we'd like to do instead of simply removing this I'd be happy to implement it.
This makes `allocate` ~75% faster.
|allocate_no_params | 19.009M| 20.087M| | | -| 1.06x| |allocate_allocate | 20.587M| 35.882M| | | -| 1.74x| ``` https://github.com/ruby/ruby/pull/13116 -- https://bugs.ruby-lang.org/

Issue #21267 has been updated by jhawthorn (John Hawthorn). At RubyKaigi @byroot and I looked at this and came up with a more efficient and resilient way to do the check (https://github.com/ruby/ruby/commit/a6365cef1084191f992ee55fa661cb9aa2196c39). But coming back to this I again don't think it's effective enough to prevent uninitialized values and we should just remove the check. Again, this check is trying to avoid users being able to make an uninitialized value of these classes, even if they are trying very hard ``` ruby
Class.instance_method(:allocate).bind_call(MatchData) (irb):6:in 'Class#allocate': calling MatchData.allocate is prohibited (TypeError)
However we aren't protecting `new` with this check, and can get uninitialized values that way (even though it has been undefined in the same way as `allocate`):
``` ruby
>> Class.instance_method(:new).bind_call(MatchData)
An error occurred when inspecting the object: #<TypeError: uninitialized MatchData>
Result of Kernel#inspect: #<MatchData:0x00007387517150b0>
We thought the reason for this special check rather than using `rb_undef_alloc_func` is so that `dup`/`clone` can still work, but that itself is problematic and a way users can get an uninitialized value. ``` ruby
match = "a".match(/./) => #<MatchData "a"> match.clone # wanting this to work is why we don't rb_undef_alloc_func (I think) => #<MatchData "a"> def match.initialize_copy(x); end => :initialize_copy match.clone # now this makes an uninitialized object An error occurred when inspecting the object: #<TypeError: uninitialized MatchData> Result of Kernel#inspect: #<MatchData:0x0000738752cfb370>
So I am back to believing we should remove the `rb_obj_respond_to` check, because it doesn't provide any safety. It seems like our only options are to tolerate uninitialized objects or use rb_undef_alloc_func.
----------------------------------------
Bug #21267: respond_to check in Class#allocate is inconsistent
https://bugs.ruby-lang.org/issues/21267#change-112883
* Author: jhawthorn (John Hawthorn)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
`Class#allocate` has an additional `rb_obj_respond_to(klass, rb_intern("allocate"))` check to forbid allocate being called on a class where it has been made private or undefined, even if used via ex. `bind_call`.
Rational.allocate (irb):1:in '<main>': undefined method 'allocate' for class Rational (NoMethodError) Class.instance_method(:allocate).bind_call(Rational) (irb):1:in 'Class#allocate': calling Rational.allocate is prohibited (TypeError)
However I don't think this provides any additional protection from users accessing an uninitialized object, as the user can redefine allocate to anything to bypass the check:
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.allocate; end}) => (0/1)
Or even override `respond_to_missing?`
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.respond_to_missing? *; true; end}) => (0/1)
So I think we should remove this check. For classes that we need to forbid allocation we should use `rb_undef_alloc_func`.
The classes I see this used for are:
* MatchData
* Refinement
* Module
* Complex
* Rational
My main motivation is that this check makes `Class#allocate` slow. There are ways we could improve that, but I don't think the check as-is is useful. If there's an alternative, more robust, check we'd like to do instead of simply removing this I'd be happy to implement it.
This makes `allocate` ~75% faster.
|allocate_no_params | 19.009M| 20.087M| | | -| 1.06x| |allocate_allocate | 20.587M| 35.882M| | | -| 1.74x| ``` https://github.com/ruby/ruby/pull/13116 -- https://bugs.ruby-lang.org/

Issue #21267 has been updated by Eregon (Benoit Daloze). I think it makes sense to replace or remove that "incorrect" check given it doesn't apply to `new`. The proper fix seems to also do the correct check in `new`. I think it's best to keep the possibility for 3rd party C extensions to undefine `allocate` (to prevent creating uninitialized objects) and still allow dup/clone (which should work fine if people don't monkey-patch `initialize_{copy,clone,dup}`, if they do the segfault/error is their responsibility then). With this change there is no protection in place left, so `Class.instance_method(:allocate).bind_call(MyType)` could segfault, even if MyType undefines `allocate`. Looking at the PR, how about setting `alloc_prohibited = true` when undefining `allocate`? (and set it back to false if it's defined again later) Another way would be to have separate allocation paths for "new object" and "copy" but this requires much more thought. Or prevent this case of "no new/allocate but dup/clone", but then for core types too for consistency (maybe too incompatible? maybe not?). Or core types could actually define dup & clone as overridden methods, and not use the alloc function at all and so use `rb_undef_alloc_func()`. That seems the cleanest actually, WDYT?
it doesn't provide any safety
Well, using bind_call is not rare and might even do it accidentally, OTOH defining a singleton `initialize_copy`/`allocate` method is pretty hardcore, and that should be no surprise it blows things up. ---------------------------------------- Bug #21267: respond_to check in Class#allocate is inconsistent https://bugs.ruby-lang.org/issues/21267#change-112888 * Author: jhawthorn (John Hawthorn) * Status: Open * Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN ---------------------------------------- `Class#allocate` has an additional `rb_obj_respond_to(klass, rb_intern("allocate"))` check to forbid allocate being called on a class where it has been made private or undefined, even if used via ex. `bind_call`. ```
Rational.allocate (irb):1:in '<main>': undefined method 'allocate' for class Rational (NoMethodError) Class.instance_method(:allocate).bind_call(Rational) (irb):1:in 'Class#allocate': calling Rational.allocate is prohibited (TypeError)
However I don't think this provides any additional protection from users accessing an uninitialized object, as the user can redefine allocate to anything to bypass the check:
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.allocate; end}) => (0/1)
Or even override `respond_to_missing?`
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.respond_to_missing? *; true; end}) => (0/1)
So I think we should remove this check. For classes that we need to forbid allocation we should use `rb_undef_alloc_func`.
The classes I see this used for are:
* MatchData
* Refinement
* Module
* Complex
* Rational
My main motivation is that this check makes `Class#allocate` slow. There are ways we could improve that, but I don't think the check as-is is useful. If there's an alternative, more robust, check we'd like to do instead of simply removing this I'd be happy to implement it.
This makes `allocate` ~75% faster.
|allocate_no_params | 19.009M| 20.087M| | | -| 1.06x| |allocate_allocate | 20.587M| 35.882M| | | -| 1.74x| ``` https://github.com/ruby/ruby/pull/13116 -- https://bugs.ruby-lang.org/

Issue #21267 has been updated by jhawthorn (John Hawthorn).
Or core types could actually define dup & clone as overridden methods, and not use the alloc function at all and so use rb_undef_alloc_func(). That seems the cleanest actually, WDYT?
Yes. The only options that work are to `rb_undef_alloc_func` or returning something safe from the alloc func. The `alloc_prohibited` approach we tried is a dead end. As far as I can tell in current Ruby uninitialized MatchData, Refinement, Module, Complex, Rational do not crash the VM (if they do we need to fix that, because again, users can access that today). Which is why we should remove it. If we want to adjust those classes to `rb_undef_alloc_func` that seems like a good idea and we can do that later. ---------------------------------------- Bug #21267: respond_to check in Class#allocate is inconsistent https://bugs.ruby-lang.org/issues/21267#change-112968 * Author: jhawthorn (John Hawthorn) * Status: Open * Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN ---------------------------------------- `Class#allocate` has an additional `rb_obj_respond_to(klass, rb_intern("allocate"))` check to forbid allocate being called on a class where it has been made private or undefined, even if used via ex. `bind_call`. ```
Rational.allocate (irb):1:in '<main>': undefined method 'allocate' for class Rational (NoMethodError) Class.instance_method(:allocate).bind_call(Rational) (irb):1:in 'Class#allocate': calling Rational.allocate is prohibited (TypeError)
However I don't think this provides any additional protection from users accessing an uninitialized object, as the user can redefine allocate to anything to bypass the check:
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.allocate; end}) => (0/1)
Or even override `respond_to_missing?`
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.respond_to_missing? *; true; end}) => (0/1)
So I think we should remove this check. For classes that we need to forbid allocation we should use `rb_undef_alloc_func`.
The classes I see this used for are:
* MatchData
* Refinement
* Module
* Complex
* Rational
My main motivation is that this check makes `Class#allocate` slow. There are ways we could improve that, but I don't think the check as-is is useful. If there's an alternative, more robust, check we'd like to do instead of simply removing this I'd be happy to implement it.
This makes `allocate` ~75% faster.
|allocate_no_params | 19.009M| 20.087M| | | -| 1.06x| |allocate_allocate | 20.587M| 35.882M| | | -| 1.74x| ``` https://github.com/ruby/ruby/pull/13116 -- https://bugs.ruby-lang.org/

Issue #21267 has been updated by Eregon (Benoit Daloze). Right. I think it's quite valuable to make those `rb_undef_alloc_func` to avoid having to handle uninitialized objects because that means extra checks (performance overhead) and less optimizations. For example if the Class#superclass internal field is not guaranteed to be constant, a JIT/VM might be more limited in what it can cache (or has to introduce an extra check to handle the uninitialized Class case). TruffleRuby already prevents `Class.allocate` for this reason. It'd be nice to do it for the other types you mentioned. Compatibility issues should be rare because it doesn't make much sense to allocate and initialize those types separately, nor to subclass them. ---------------------------------------- Bug #21267: respond_to check in Class#allocate is inconsistent https://bugs.ruby-lang.org/issues/21267#change-112982 * Author: jhawthorn (John Hawthorn) * Status: Open * Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN ---------------------------------------- `Class#allocate` has an additional `rb_obj_respond_to(klass, rb_intern("allocate"))` check to forbid allocate being called on a class where it has been made private or undefined, even if used via ex. `bind_call`. ```
Rational.allocate (irb):1:in '<main>': undefined method 'allocate' for class Rational (NoMethodError) Class.instance_method(:allocate).bind_call(Rational) (irb):1:in 'Class#allocate': calling Rational.allocate is prohibited (TypeError)
However I don't think this provides any additional protection from users accessing an uninitialized object, as the user can redefine allocate to anything to bypass the check:
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.allocate; end}) => (0/1)
Or even override `respond_to_missing?`
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.respond_to_missing? *; true; end}) => (0/1)
So I think we should remove this check. For classes that we need to forbid allocation we should use `rb_undef_alloc_func`.
The classes I see this used for are:
* MatchData
* Refinement
* Module
* Complex
* Rational
My main motivation is that this check makes `Class#allocate` slow. There are ways we could improve that, but I don't think the check as-is is useful. If there's an alternative, more robust, check we'd like to do instead of simply removing this I'd be happy to implement it.
This makes `allocate` ~75% faster.
|allocate_no_params | 19.009M| 20.087M| | | -| 1.06x| |allocate_allocate | 20.587M| 35.882M| | | -| 1.74x| ``` https://github.com/ruby/ruby/pull/13116 -- https://bugs.ruby-lang.org/

Issue #21267 has been updated by jhawthorn (John Hawthorn). Status changed from Open to Closed I missed tagging this issue but the check was removed by https://github.com/ruby/ruby/commit/b0502e8f904f155ac077014077431ed5e54b9996 ---------------------------------------- Bug #21267: respond_to check in Class#allocate is inconsistent https://bugs.ruby-lang.org/issues/21267#change-113436 * Author: jhawthorn (John Hawthorn) * Status: Closed * Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN ---------------------------------------- `Class#allocate` has an additional `rb_obj_respond_to(klass, rb_intern("allocate"))` check to forbid allocate being called on a class where it has been made private or undefined, even if used via ex. `bind_call`. ```
Rational.allocate (irb):1:in '<main>': undefined method 'allocate' for class Rational (NoMethodError) Class.instance_method(:allocate).bind_call(Rational) (irb):1:in 'Class#allocate': calling Rational.allocate is prohibited (TypeError)
However I don't think this provides any additional protection from users accessing an uninitialized object, as the user can redefine allocate to anything to bypass the check:
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.allocate; end}) => (0/1)
Or even override `respond_to_missing?`
Class.instance_method(:allocate).bind_call(Class.new(Rational) {def self.respond_to_missing? *; true; end}) => (0/1)
So I think we should remove this check. For classes that we need to forbid allocation we should use `rb_undef_alloc_func`.
The classes I see this used for are:
* MatchData
* Refinement
* Module
* Complex
* Rational
My main motivation is that this check makes `Class#allocate` slow. There are ways we could improve that, but I don't think the check as-is is useful. If there's an alternative, more robust, check we'd like to do instead of simply removing this I'd be happy to implement it.
This makes `allocate` ~75% faster.
|allocate_no_params | 19.009M| 20.087M| | | -| 1.06x| |allocate_allocate | 20.587M| 35.882M| | | -| 1.74x| ``` https://github.com/ruby/ruby/pull/13116 -- https://bugs.ruby-lang.org/
participants (2)
-
Eregon (Benoit Daloze)
-
jhawthorn (John Hawthorn)