[ruby-core:124634] [Ruby Feature#21852] New improved allocator function interface
Issue #21852 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #21852: New improved allocator function interface
https://bugs.ruby-lang.org/issues/21852

* Author: byroot (Jean Boussier)
* Status: Open
----------------------------------------
When implementing native types with the `TypedData` API, you have to define an allocator function. That function receives the class to allocate and is supposed to return a new instance.

```c
/**
 * This is the type of functions that ruby calls when trying to allocate an
 * object.  It is sometimes necessary to allocate extra memory regions for an
 * object.  When you define a class that uses ::RTypedData, it is typically the
 * case.  On such situations define a function of this type and pass it to
 * rb_define_alloc_func().
 *
 * @param[in]  klass  The class that this function is registered.
 * @return     A newly allocated instance of `klass`.
 */
typedef VALUE (*rb_alloc_func_t)(VALUE klass);
```

### Current API shortcomings

There are a few limitations with the current API.

#### Hard to disallow `.allocate` without breaking `#dup` and `#clone`.

First, extensions frequently want to disable `Class#allocate` for their native types via `rb_undef_alloc_func`, as allowing uninitialized objects would very often lead to bugs.

The problem with `rb_undef_alloc_func` is that the alloc func is also used internally by `dup` and `clone`, so most types that undefine the allocator also prevent object copying, often without realizing it.

If you want to disable `Class#allocate` yet still allow copying, you need to implement the `#dup` and `#clone` methods entirely yourself, which is non-trivial and which very few types do. One notable exception is `Binding`, which has to implement these two methods: https://github.com/ruby/ruby/blob/bea48adbcacc29cce9536977e15ceba0d65c8a02/p...

This works for Ruby code; however, it doesn't work with the C-level `rb_obj_dup(VALUE)`, which the Ractor logic uses to copy objects across ractors. In the case of `Binding` we probably wouldn't allow it anyway, but for other types it may be a problem.

#### Can't support objects of variable width

When duping or cloning an object of variable width, you need access to the original object to be able to allocate the right slot size.

An example of that is `Thread::Backtrace` objects, as evidenced by [Bug #21818]. To support sending exception objects across ractors, we'd need to make `rb_obj_dup()` work for `Thread::Backtrace`, but to correctly duplicate a backtrace, the allocator needs to know the size.

### Proposed new API

I'd like to propose a new API for defining allocators:

```c
typedef VALUE (*rb_copy_alloc_func_t)(VALUE klass, VALUE other);
```

In addition to the class to allocate, the function also receives the instance to copy. When called by `Class#allocate`, the `other` argument is set to `Qundef`.

Example usage:

```c
static VALUE
backtrace_alloc(VALUE klass, VALUE other)
{
    rb_backtrace_t *bt;
    if (UNDEF_P(other)) {
        // Regular alloc
        return TypedData_Make_Struct(klass, rb_backtrace_t, &backtrace_data_type, bt);
    }
    else {
        // Copy
        rb_backtrace_t *other_bt;
        TypedData_Get_Struct(other, rb_backtrace_t, &backtrace_data_type, other_bt);
        VALUE self = backtrace_alloc_capa(other_bt->backtrace_size, &bt);
        bt->backtrace_size = other_bt->backtrace_size;
        MEMCPY(bt->backtrace, other_bt->backtrace, rb_backtrace_location_t, other_bt->backtrace_size);
        return self;
    }
}
```

### Backward compatibility

Older-style allocators can remain supported for as long as we wish.

The one potential backward compatibility concern is third-party code that calls `rb_alloc_func_t rb_get_alloc_func(VALUE klass);`. As its documentation suggests, there aren't many valid use cases for it, but regardless we can keep supporting it by returning a "jump function". See `copy_allocator_adapter`: https://github.com/ruby/ruby/pull/15795/changes#diff-884a5a8a369ef1b4c7597e0...

### Opportunity for more changes?

I was discussing this new interface with @ko1 and it appears that the current allocator interface may also be a limitation for Ractors and Ractor local GC. That is, it might be useful to let the allocator function know that we're copying from one Ractor to another.

But I know too little about Ractor local GC to make a proposal here, so I will let @ko1 make suggestions.

### Implementation

I implemented this idea in https://github.com/ruby/ruby/pull/15795, to solve [Bug #21818].

It could remain a purely private API, but I think it would make sense to expose it.

--
https://bugs.ruby-lang.org/
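For readers who want to see the shape of the proposal without building Ruby, here is a minimal sketch in plain C. It is not the code from the pull request: `toy_value`, `TOY_UNDEF`, `toy_alloc_capa`, `toy_copy_alloc` and `toy_copy_alloc_adapter` are all hypothetical stand-ins for `VALUE`, `Qundef`, a sized allocation helper, a copy allocator and the "jump function" respectively, and the class argument is modelled as a plain `int`. It only illustrates two points made above: a copy allocator can size the new object from its source, and an old-style one-argument allocator can be provided by forwarding with the "undefined" marker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable-width object: a header followed by `len` items. */
struct toy_obj {
    size_t len;
    int items[];
};

/* Hypothetical stand-ins for VALUE and Qundef; not Ruby's definitions. */
typedef struct toy_obj *toy_value;
#define TOY_UNDEF ((toy_value)NULL)

/* Old-style and new-style allocator signatures, mirroring the proposal. */
typedef toy_value (*toy_alloc_func_t)(int klass);
typedef toy_value (*toy_copy_alloc_func_t)(int klass, toy_value other);

/* Allocate a zeroed object with room for `len` items (aborts on failure). */
static toy_value
toy_alloc_capa(size_t len)
{
    toy_value obj = calloc(1, sizeof(struct toy_obj) + len * sizeof(int));
    assert(obj != NULL);
    obj->len = len;
    return obj;
}

/* New-style allocator: when copying, the size comes from the source. */
static toy_value
toy_copy_alloc(int klass, toy_value other)
{
    (void)klass; /* unused in this toy model */
    if (other == TOY_UNDEF) {
        return toy_alloc_capa(0); /* regular allocation */
    }
    toy_value self = toy_alloc_capa(other->len);
    memcpy(self->items, other->items, other->len * sizeof(int));
    return self;
}

/* Adapter with the old signature: forwards with the "undefined" marker.
 * A real adapter would have to look up the class's own copy allocator;
 * here the target is hard-wired for brevity. */
static toy_value
toy_copy_alloc_adapter(int klass)
{
    return toy_copy_alloc(klass, TOY_UNDEF);
}
```

Code that only knows the old signature can still hold `toy_copy_alloc_adapter` in a `toy_alloc_func_t` variable and call it, which is the compatibility property the "jump function" is meant to preserve.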
Issue #21852 has been updated by Eregon (Benoit Daloze).

Are `initialize_copy/initialize_dup/initialize_clone` still called when using a `copy_allocator`? In the `backtrace_alloc` example you seem to already do some copying in that function, which then makes it unclear where the copying should be done.

> First, it is frequent for extensions to want to disable `Class#allocate` for their native types via `rb_undef_alloc_func`, as very often allowing uninitialized object would lead to bugs.
How about using `rb_undef_method(klass, "allocate");` for such cases? I think that's a simple solution and requires no changes. Fundamentally there is a difference between an `allocate` method and an `alloc function`, new/dup/clone all require an `alloc function`, but they should not require (and they don't IIRC) an `allocate` method.

https://bugs.ruby-lang.org/issues/21852#change-116245
Issue #21852 has been updated by byroot (Jean Boussier).
> Are initialize_copy/initialize_dup/initialize_clone still called when using a copy_allocator?

Yes, it is unchanged.

> In the backtrace_alloc example you seem to already do some copying in that function, which then makes it unclear where the copying should be done.

Indeed. Technically it wouldn't be required, but I think it's more reliable to do it there than in `initialize_copy`, as the latter could be redefined and cause corruption.

> How about using rb_undef_method(klass, "allocate"); for such cases?
It's a corner case, but that allows redefining it later on.

https://bugs.ruby-lang.org/issues/21852#change-116246
Issue #21852 has been updated by Eregon (Benoit Daloze). byroot (Jean Boussier) wrote in #note-2:
> Indeed. Technically it wouldn't be required, but I think it's more reliable to do it there than in `initialize_copy`, as the latter could be redefined and cause corruption.
That could cause leaks if copying state involves extra allocations though, as a previously-existing `initialize_copy` might allocate and just set the pointers, but not free that first copy done in the `copy_allocator`. It makes the contract unclear about what is supposed to copy what. I think we need to trust `initialize_copy`, or invent a new Ruby-level protocol for copying objects.

Inventing a new Ruby-level protocol for copying objects and for creation without uninitialized state would be great. Some core classes already do this but then they typically don't support dup/clone. It'd be great to have this in general, so one could write classes that never have to care about uninitialized state since there are no instances in that uninitialized state ever. We'd have some method to do both allocation + initialization at once, and another method to create a copy + initialize it as one call.
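The leak scenario described above can be made concrete with a small model in plain C. This is only a sketch under stated assumptions: `toy_state`, `buf_dup`, `buf_free`, `copy_alloc`, `legacy_initialize_copy` and `careful_initialize_copy` are hypothetical names, not Ruby APIs, and a global counter stands in for a leak detector. The point is that an allocator which eagerly copies a heap buffer, followed by an older copy hook written on the assumption that the destination is empty, leaves the first buffer unreachable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Number of buffers allocated and not yet freed (toy leak detector). */
static int live_buffers = 0;

/* Hypothetical native state owning one heap buffer. */
struct toy_state {
    char *buf;
    size_t len;
};

/* Duplicate `len` bytes (len must be > 0) and count the new buffer. */
static char *
buf_dup(const char *src, size_t len)
{
    char *p = malloc(len);
    assert(p != NULL);
    memcpy(p, src, len);
    live_buffers++;
    return p;
}

static void
buf_free(char *p)
{
    if (p) {
        free(p);
        live_buffers--;
    }
}

/* Models a copy allocator that already copies the buffer. */
static struct toy_state
copy_alloc(const struct toy_state *other)
{
    struct toy_state self = { buf_dup(other->buf, other->len), other->len };
    return self;
}

/* Models an older copy hook that assumes `self` holds no buffer yet:
 * it overwrites the pointer, so the allocator's copy is leaked. */
static void
legacy_initialize_copy(struct toy_state *self, const struct toy_state *other)
{
    self->buf = buf_dup(other->buf, other->len);
    self->len = other->len;
}

/* A hook aware of the eager copy releases it before replacing it. */
static void
careful_initialize_copy(struct toy_state *self, const struct toy_state *other)
{
    buf_free(self->buf);
    self->buf = buf_dup(other->buf, other->len);
    self->len = other->len;
}
```

Running `copy_alloc` followed by `legacy_initialize_copy` leaves one buffer that nothing points to, while the `careful_initialize_copy` path does not, which is why the division of labour between the allocator and the copy hook needs to be spelled out.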
> > How about using rb_undef_method(klass, "allocate"); for such cases?
>
> It's a corner case, but that allows redefining it later on.
I don't think we need to worry about this corner case. Such things are clearly violating internals of the class, and then they might as well `rb_define_alloc_func()` and break it too. But I suppose for core classes there might be a point to try to not segfault in that case, mmh. Maybe classes should have a flag for "allow/disallow Class#allocate" and then `Class#allocate` would check that? There is already a check for singleton classes, so we could merge it with that check for free and just have singleton classes always set that flag to false.

https://bugs.ruby-lang.org/issues/21852#change-116247
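The per-class flag floated in this message can be sketched in a few lines of plain C. Everything here is hypothetical (`toy_class`, `allocate_allowed`, `toy_class_allocate`, `toy_obj_dup`), objects are modelled as integer ids, and a return value of 0 stands in for the exception Ruby would raise. The sketch only shows the intended split: the public allocate path consults the flag, while the internal copy path keeps using the allocator function regardless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical class record: an allocator plus an "allocate allowed" flag. */
struct toy_class {
    bool allocate_allowed;
    int (*alloc)(void);
};

/* Toy allocator: hands out increasing object ids, starting at 1. */
static int toy_next_id = 1;

static int
toy_alloc(void)
{
    return toy_next_id++;
}

/* Models Class#allocate: refuses when the flag is cleared.
 * Returns 0 where Ruby would raise an exception. */
static int
toy_class_allocate(const struct toy_class *klass)
{
    assert(klass->alloc != NULL);
    if (!klass->allocate_allowed) {
        return 0;
    }
    return klass->alloc();
}

/* Models the internal dup/clone path: uses the allocator directly
 * and never looks at the flag. */
static int
toy_obj_dup(const struct toy_class *klass)
{
    assert(klass->alloc != NULL);
    return klass->alloc();
}
```

With the flag cleared, `toy_class_allocate` fails while `toy_obj_dup` still succeeds, which is the behaviour that `rb_undef_alloc_func` cannot provide today because it removes the allocator for both paths.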
Issue #21852 has been updated by byroot (Jean Boussier).
> Inventing a new Ruby-level protocol for copying objects and for creation without uninitialized state would be great.
Yes, this is somewhat what this new allocator API does. It also solves the problem that the Ractor API must be able to clone objects but can hardly trust user defined `initialize_copy` methods.

https://bugs.ruby-lang.org/issues/21852#change-116248
Issue #21852 has been updated by Eregon (Benoit Daloze).

It only does it for classes defined in C, and if they do all state copying in `copy_allocator`. I think this would be valuable to have for any class, i.e. also for classes defined in Ruby and not in C.

https://bugs.ruby-lang.org/issues/21852#change-116249