[ruby-core:124635] [Ruby Feature#21853] Make Embedded TypedData a public API
Issue #21853 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #21853: Make Embedded TypedData a public API
https://bugs.ruby-lang.org/issues/21853

* Author: byroot (Jean Boussier)
* Status: Open
----------------------------------------
As part of Ruby 3.3, we added a private `RUBY_TYPED_EMBEDDABLE` flag to the `TypedData` API to allow `TypedData` to use variable width allocation.

Technically, we inadvertently exposed that flag in public headers, so third-party extensions can already make use of it. However, it is undocumented and therefore not considered public API, so relying on it would be a poor decision.

This API has both memory and speed benefits, as it avoids some `malloc`/`free` churn, reduces pointer chasing, etc. For instance, when we converted `Time` to be embedded, it improved allocation performance by 30% and also reduced memory usage by 20%: https://github.com/ruby/ruby/commit/aa6642de630cfc10063154d84e45a7bff30e9103

I believe numerous third-party native extensions could benefit from it (I would certainly make use of it in `ruby/json`). Now that we have used it internally for several years, I'd like to work on making it a public API for Ruby 4.1.

--
https://bugs.ruby-lang.org/
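To make the `malloc`/`free` and pointer-chasing argument concrete, here is a minimal, self-contained sketch of the two layouts. It is a toy model only: `toy_header`, `TOY_EMBEDDED`, and every `toy_*` function are hypothetical names invented for illustration, and none of this is Ruby's actual `RTypedData` layout or allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy object header standing in for a GC-managed slot.
 * Hypothetical; NOT Ruby's RTypedData. */
struct toy_header {
    unsigned flags; /* TOY_EMBEDDED set => payload lives in the same block */
    void *data;     /* used only by the non-embedded layout */
};

#define TOY_EMBEDDED 1u

/* Non-embedded layout: two allocations, linked by a pointer. */
static struct toy_header *toy_new_separate(size_t payload_size)
{
    struct toy_header *h = calloc(1, sizeof(*h));
    if (!h) return NULL;
    h->data = calloc(1, payload_size);
    if (!h->data) { free(h); return NULL; }
    return h;
}

/* Embedded layout: one allocation, payload stored right after the header.
 * Assumes the payload needs no stricter alignment than the header. */
static struct toy_header *toy_new_embedded(size_t payload_size)
{
    struct toy_header *h = calloc(1, sizeof(*h) + payload_size);
    if (!h) return NULL;
    h->flags = TOY_EMBEDDED;
    return h;
}

/* Accessor: embedded objects compute the address, others follow a pointer. */
static void *toy_get_data(struct toy_header *h)
{
    if (h->flags & TOY_EMBEDDED) return (char *)(h + 1);
    return h->data;
}

/* Freeing: the embedded layout needs one free() instead of two. */
static void toy_free(struct toy_header *h)
{
    if (!h) return;
    if (!(h->flags & TOY_EMBEDDED)) free(h->data);
    free(h);
}
```

In this model the embedded variant saves one allocation and one deallocation per object, and reading the payload does not require loading a pointer first. Whether that translates into gains like the ones reported for `Time` depends on the real allocator and object sizes.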
Issue #21853 has been updated by Eregon (Benoit Daloze).

I'm thinking about this in the context of TruffleRuby, where `RTypedData` never moves (it's allocated via system `calloc()`). I think the best approach there would be to ignore this new flag entirely, so the public API should be designed in a way that lets an implementation treat the data as if it were not embedded.

Related: https://github.com/truffleruby/truffleruby/issues/4130

So on TruffleRuby I think we could always use the same allocation for the `RTypedData` + `data` struct, when using `TypedData_Make_Struct()`, effectively the same as embedded TypedData but never moving. But not when using `TypedData_Wrap_Struct()`, since that uses an existing data pointer.

https://bugs.ruby-lang.org/issues/21853#change-116224
Issue #21853 has been updated by byroot (Jean Boussier).
> So on TruffleRuby I think we could always use the same allocation for the `RTypedData` + `data` struct, when using `TypedData_Make_Struct()`, effectively the same as embedded TypedData but never moving.
I don't think so, because you still need to support `DATA_PTR(obj) = ptr`, which isn't allowed for embedded typed datas.

https://bugs.ruby-lang.org/issues/21853#change-116225
Issue #21853 has been updated by Eregon (Benoit Daloze).

Good point! How do embedded typed datas handle this, do they raise an exception in such a case? Seems tricky given the `DATA_PTR(obj)` API returning a pointer.

I'd actually love if we had a separate API for changing the data pointer as a macro or function (e.g. `RTYPEDDATA_SET_DATA(obj, new_data_pointer)` to follow `RTYPEDDATA_GET_DATA`), so we know better when it can be changed. Currently, as a workaround in TruffleRuby, after every native call that accesses a T_DATA we have to check whether the data pointer has changed :/

Of course we wouldn't be able to remove `DATA_PTR()` yet, but we could maybe deprecate it and at some point make it return a `const` pointer or so to prevent writes.

https://bugs.ruby-lang.org/issues/21853#change-116241
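As a rough illustration of what a dedicated setter could buy, the sketch below models one in plain C. `RTYPEDDATA_SET_DATA` is only a proposal in this thread, not an existing API, and `toy_obj`, `toy_set_data`, and the `writes` counter are hypothetical names made up for this example. The point is that a single entry point can both reject embedded objects and give the runtime a place to observe pointer changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object; hypothetical, NOT Ruby's RTypedData. */
struct toy_obj {
    bool embedded;        /* payload lives inside the object itself */
    void *data;           /* external payload pointer (non-embedded only) */
    unsigned long writes; /* how many times the pointer was replaced */
};

/* Hypothetical setter in the spirit of the proposed RTYPEDDATA_SET_DATA.
 * Returns 0 on success, -1 if the object is embedded and therefore has
 * no replaceable data pointer. */
static int toy_set_data(struct toy_obj *obj, void *new_data)
{
    if (obj->embedded) return -1; /* refuse instead of corrupting */
    obj->data = new_data;
    obj->writes++;                /* the runtime can see the change here */
    return 0;
}
```

With writes funneled through one function, an implementation would not need to re-read the pointer after every native call to detect a change. How a real API should report the embedded case (error code, exception, or assertion) is left open here.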
Issue #21853 has been updated by byroot (Jean Boussier).
> How do embedded typed datas handle this, do they raise an exception in such a case?
Unfortunately not. It ends up causing data corruption.
> I'd actually love if we had a separate API for changing the data pointer as a macro or function
Makes sense.

https://bugs.ruby-lang.org/issues/21853#change-116244
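The corruption mentioned above can be reproduced in a toy model. The sketch assumes, as an illustration rather than a statement about Ruby's internals, that the embedded payload starts in the slot that would otherwise hold the data pointer, so an unchecked pointer store (the analogue of `DATA_PTR(obj) = ptr`) lands on top of the payload. All names (`toy_obj`, `payload`, `toy_*`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical payload an extension might store. */
struct payload { long sec; long nsec; };

/* Toy object; NOT Ruby's RTypedData. Assumption for this model: when
 * embedded, the payload bytes begin at the `as` slot. */
struct toy_obj {
    int embedded;
    union {
        void *data;                                 /* non-embedded */
        unsigned char inline_bytes[sizeof(void *)]; /* embedded start */
    } as;
};

/* Address of the payload for either layout. */
static void *toy_data_addr(struct toy_obj *o)
{
    if (o->embedded)
        return (unsigned char *)o + offsetof(struct toy_obj, as);
    return o->as.data;
}

/* One allocation holding header and payload (deliberately over-sized). */
static struct toy_obj *toy_new_embedded(const struct payload *init)
{
    struct toy_obj *o = calloc(1, sizeof(*o) + sizeof(*init));
    if (!o) return NULL;
    o->embedded = 1;
    memcpy(toy_data_addr(o), init, sizeof(*init));
    return o;
}

/* Copy the payload out; memcpy avoids type-punning problems. */
static void toy_read(struct toy_obj *o, struct payload *out)
{
    memcpy(out, toy_data_addr(o), sizeof(*out));
}

/* Analogue of `DATA_PTR(obj) = ptr`: no check for the embedded case. */
static void toy_unchecked_set_ptr(struct toy_obj *o, void *p)
{
    o->as.data = p; /* on an embedded object this clobbers payload bytes */
}
```

After `toy_unchecked_set_ptr()` runs on an embedded object, the first bytes of the payload hold a pointer value instead of the original fields, and no error is raised at the point of the write.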