Issue #21963 has been updated by Eregon (Benoit Daloze).

I realized these `init` & `copy` C function hooks could actually be done partly with the proposal in #21852, cc @byroot.
Specifically, the `rb_copy_alloc_func_t` gets the original object, so that's equivalent to `copy` and it would be called almost at the right time.
And that function can then correctly initialize the C parts of the object so it's valid (at least can't cause segfaults) after it returns.
The one difference in timing is that for `clone` it would be called before the singleton class is copied & set (in case the original object has a singleton class), which doesn't seem like much of an issue.

The missing part is that `rb_copy_alloc_func_t`, when called from `Class#new`, doesn't receive the arguments, so it is hard to properly initialize the C structs.
So maybe the new allocator function should be like:
```c
typedef VALUE (*rb_copy_alloc_func_t)(VALUE klass, VALUE original, int initialize_argc, const VALUE *initialize_argv);
```
or so, and either `original` would be set (when called from `dup`/`clone`) or `initialize_argc + initialize_argv` would be set (when called from `Class#new`).

----------------------------------------
Feature #21963: A solution to completely avoid allocated-but-uninitialized objects
https://bugs.ruby-lang.org/issues/21963#change-116874

* Author: Eregon (Benoit Daloze)
* Status: Open
----------------------------------------
A common issue when defining a class is to handle allocated-but-uninitialized objects.
For example:
```ruby
obj = MyClass.allocate
obj.some_method
```
This can easily segfault for classes defined in C and raise an unclear exception for classes defined in Ruby.

As a workaround many core (and non-core) classes add a check that they are initialized in *every* instance method.
This is suboptimal for performance and correctness; classes should not need to care about allocated-but-uninitialized objects.
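
To make the workaround above concrete, here is a minimal Ruby sketch. `Temperature`, `fahrenheit_checked` and `check_initialized` are hypothetical names chosen for illustration, not taken from any core class, and the choice of `TypeError` for the guard is an assumption:
```ruby
# Hypothetical class, for illustration only.
class Temperature
  def initialize(celsius)
    @celsius = celsius
  end

  # Unguarded: silently assumes #initialize has run.
  def fahrenheit
    @celsius * 9.0 / 5 + 32
  end

  # Guarded: the kind of per-method check the workaround requires.
  def fahrenheit_checked
    check_initialized
    @celsius * 9.0 / 5 + 32
  end

  private

  def check_initialized
    return if instance_variable_defined?(:@celsius)
    raise TypeError, "uninitialized #{self.class}"
  end
end

raw = Temperature.allocate # bypasses #initialize entirely

# Without the guard, the failure surfaces as an unrelated error on nil.
unguarded_error = begin
  raw.fahrenheit
rescue NoMethodError => e
  e
end

# With the guard, the error names the actual problem.
guarded_error = begin
  raw.fahrenheit_checked
rescue TypeError => e
  e
end
```
The guard has to be repeated in every public method that touches `@celsius`, which is the per-method cost described above.
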

Fundamentally, to solve this we need to guarantee that after the allocation function is used, one of `initialize`, `initialize_dup` or `initialize_clone` is called.
And we can't guarantee that for `Class#allocate`.

The current workarounds are:
* `undef allocate`, but this does not prevent `Class.instance_method(:allocate).bind_call(Foo)`.
* `rb_undef_alloc_func()`, but this breaks `dup`, `clone` and `Marshal`.

The idea is to have, in addition to the `public alloc function` (in `rb_classext_struct.as.class.allocator`), an `internal alloc function`.
Then:
* `Class#new`, `dup`, `clone` and `Marshal` always use the internal alloc function, because they guarantee to call `initialize`, `initialize_dup` or `initialize_clone`.
* `rb_define_alloc_func()` sets both fields.
* `rb_undef_alloc_func()` sets both fields.
* `rb_get_alloc_func()` reads the public alloc function (unchanged).
* `Class#allocate` uses the public alloc function (unchanged).

We add a new method on `Class`, for example `Class#safe_initialization`, which:
* Sets the public alloc function to `UNDEF_ALLOC_FUNC`, same as `rb_undef_alloc_func()`, so `Class#allocate` and `rb_get_alloc_func()` will raise if they are used (as they are unsafe).
* Preserves the internal alloc function so `Class#new`, `dup`, `clone` and `Marshal` keep working.

After that the class has fully safe initialization and does not need to worry about allocated-but-uninitialized objects anymore.

From https://bugs.ruby-lang.org/issues/21852#note-7

--
https://bugs.ruby-lang.org/