Issue #21852 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #21852: New improved allocator function interface
https://bugs.ruby-lang.org/issues/21852

* Author: byroot (Jean Boussier)
* Status: Open
----------------------------------------
When implementing native types with the `TypedData` API, you have to define an allocator function. That function receives the class to allocate and is supposed to return a new instance.

```c
/**
 * This is the type of functions that ruby calls when trying to allocate an
 * object.  It is sometimes necessary to allocate extra memory regions for an
 * object.  When you define a class that uses ::RTypedData, it is typically the
 * case.  On such situations define a function of this type and pass it to
 * rb_define_alloc_func().
 *
 * @param[in]  klass  The class that this function is registered.
 * @return     A newly allocated instance of `klass`.
 */
typedef VALUE (*rb_alloc_func_t)(VALUE klass);
```

### Current API shortcomings

There are a few limitations with the current API.

#### Hard to disallow `.allocate` without breaking `#dup` and `#clone`.

First, extensions frequently want to disable `Class#allocate` for their native types via `rb_undef_alloc_func`, because allowing uninitialized objects very often leads to bugs.

The problem with `rb_undef_alloc_func` is that the alloc func is also used internally by `dup` and `clone`, so most types that undefine the allocator also prevent object copying, often without their authors realizing it.

If you want to disable `Class#allocate` yet still allow copying, you need to implement the `#dup` and `#clone` methods entirely yourself, which is non-trivial, and very few types do.

One notable exception is `Binding`, which has to implement these two methods: https://github.com/ruby/ruby/blob/bea48adbcacc29cce9536977e15ceba0d65c8a02/p...

This works for Ruby code; however, it doesn't work with the C-level `rb_obj_dup(VALUE)`, which the Ractor logic uses to copy objects across ractors. In the case of `Binding` we probably wouldn't allow it anyway, but for other types it may be a problem.

#### Can't support objects of variable width

When duping or cloning an object of variable width, you need access to the original object to be able to allocate the right slot size.

An example of that is `Thread::Backtrace` objects, as evidenced by [Bug #21818]. To support sending exception objects across ractors, we'd need to make `rb_obj_dup()` work for `Thread::Backtrace`, but to correctly duplicate a backtrace, the allocator needs to know the size.

### Proposed new API

I'd like to propose a new API for defining allocators:

```c
typedef VALUE (*rb_copy_alloc_func_t)(VALUE klass, VALUE other);
```

In addition to the class to allocate, the function also receives the instance to copy. When called by `Class#allocate`, the `other` argument is set to `Qundef`.

Example usage:

```c
static VALUE
backtrace_alloc(VALUE klass, VALUE other)
{
    rb_backtrace_t *bt;
    if (UNDEF_P(other)) {
        // Regular alloc
        return TypedData_Make_Struct(klass, rb_backtrace_t, &backtrace_data_type, bt);
    }
    else {
        // Copy
        rb_backtrace_t *other_bt;
        TypedData_Get_Struct(other, rb_backtrace_t, &backtrace_data_type, other_bt);
        VALUE self = backtrace_alloc_capa(other_bt->backtrace_size, &bt);
        bt->backtrace_size = other_bt->backtrace_size;
        MEMCPY(bt->backtrace, other_bt->backtrace, rb_backtrace_location_t, other_bt->backtrace_size);
        return self;
    }
}
```

### Backward compatibility

Older-style allocators can remain supported for as long as we wish.

The one potential backward compatibility concern is third-party code that calls `rb_alloc_func_t rb_get_alloc_func(VALUE klass);`. As its documentation suggests, there aren't many valid use cases for it, but regardless we can keep supporting it by returning a "jump function".
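
To make the "jump function" idea concrete, here is a minimal standalone sketch of how such an adapter could work. It is a toy model and not the code from the pull request: `toy_class`, `toy_object`, `toy_list_alloc`, `toy_copy_alloc_adapter` and `toy_get_alloc_func` are all hypothetical names, plain pointers stand in for `VALUE`, and `NULL` stands in for `Qundef`. The assumption is that a class registered with a two-argument allocator can still hand out a one-argument function pointer that forwards to it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins for illustration only; these are NOT Ruby's real types. */
typedef struct toy_object toy_object;
typedef struct toy_class toy_class;

/* Old-style signature: only the class is known. */
typedef toy_object *(*toy_alloc_func_t)(toy_class *klass);
/* New-style signature: also receives the object being copied, or NULL. */
typedef toy_object *(*toy_copy_alloc_func_t)(toy_class *klass, const toy_object *other);

struct toy_class {
    toy_alloc_func_t alloc;           /* old-style allocator, may be NULL */
    toy_copy_alloc_func_t copy_alloc; /* new-style allocator, may be NULL */
};

struct toy_object {
    toy_class *klass;
    size_t len;
    int items[]; /* variable-width payload */
};

/* New-style allocator: sizes the new object from `other` when copying.
 * `other == NULL` models the `Qundef` case, i.e. a plain allocation. */
static toy_object *
toy_list_alloc(toy_class *klass, const toy_object *other)
{
    size_t len = other ? other->len : 0;
    toy_object *obj = calloc(1, sizeof(*obj) + len * sizeof(int));
    if (!obj) return NULL;
    obj->klass = klass;
    obj->len = len;
    if (other) memcpy(obj->items, other->items, len * sizeof(int));
    return obj;
}

/* The "jump function": it has the old one-argument signature and simply
 * forwards to the class's two-argument allocator with no source object. */
static toy_object *
toy_copy_alloc_adapter(toy_class *klass)
{
    assert(klass->copy_alloc != NULL);
    return klass->copy_alloc(klass, NULL);
}

/* Toy analogue of a getter that must keep returning an old-style pointer. */
static toy_alloc_func_t
toy_get_alloc_func(const toy_class *klass)
{
    if (klass->copy_alloc) return toy_copy_alloc_adapter;
    return klass->alloc;
}
```

In this model, a caller that only knows the old signature keeps working unchanged, while a copy path that has the source object at hand can call the two-argument allocator directly and get a correctly sized object.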

See `copy_allocator_adapter`: https://github.com/ruby/ruby/pull/15795/changes#diff-884a5a8a369ef1b4c7597e0...

### Opportunity for more changes?

I was discussing this new interface with @ko1, and it appears that the current allocator interface may also be a limitation for Ractors and Ractor-local GC, i.e. it might be useful to let the allocator function know that we're copying from one Ractor to another.

But I know too little about Ractor-local GC to make a proposal here, so I will let @ko1 make suggestions.

### Implementation

I implemented this idea in https://github.com/ruby/ruby/pull/15795, to solve [Bug #21818].

It could remain a purely private API, but I think it would make sense to expose it.

-- 
https://bugs.ruby-lang.org/