Issue #22008 has been reported by jhawthorn (John Hawthorn).

----------------------------------------
Feature #22008: RUBY_INTERNAL_EVENT_NEWOBJ should run earlier, with fully uninitialized object
https://bugs.ruby-lang.org/issues/22008

* Author: jhawthorn (John Hawthorn)
* Status: Open
* Assignee: jhawthorn (John Hawthorn)
----------------------------------------
`RUBY_INTERNAL_EVENT_NEWOBJ` is an internal tracepoint event, accessible only from C. I'd consider it an unstable, semi-private API, really intended to be used only by ObjectSpace.

The documentation states:

* in internal events, **you can not use any Ruby APIs** (even object creations)
* Limitations are MRI version specific

Basically, it's unsafe to use any Ruby APIs in the hook. An exception is `rb_profile_frames`, but I believe everything else should not be allowed.

Currently the `RUBY_INTERNAL_EVENT_NEWOBJ` hook fires after the object has been assigned its klass, flags, and (sometimes) shape_id. In pseudocode, the current `newobj_of` is:

```c
VALUE
rb_gc_impl_new_obj(..., klass, flags, ...)
{
    VALUE obj = freelist_pop_or_alloc(size);
    obj->flags = flags;
    obj->klass = klass;
    return obj;
}

VALUE
newobj_of(klass, flags, shape_id, ...)
{
    VALUE obj = rb_gc_impl_new_obj(..., klass, flags, ...);
    obj->shape_id = shape_id;
    if (rb_gc_event_hook_required_p(RUBY_INTERNAL_EVENT_NEWOBJ)) {
        // hook receives partially initialized object (klass, flags, shape_id, but no other fields)
        gc_newobj_hook(obj);
    }
    return obj;
}
```

Instead, I would like this to look like the following:

```c
VALUE
rb_gc_impl_new_obj(...)
{
    VALUE obj = freelist_pop_or_alloc(size);
    if (rb_gc_event_hook_required_p(RUBY_INTERNAL_EVENT_NEWOBJ)) {
        // hook receives uninitialized object, a fully 0-initialized T_NONE
        gc_newobj_hook(obj);
    }
    return obj;
}

VALUE
newobj_of(klass, flags, shape_id, ...)
{
    VALUE obj = rb_gc_impl_new_obj(...);
    obj->flags = flags;
    obj->shape_id = shape_id;
    obj->klass = klass;
    return obj;
}
```

Calling the hook with a _partially_ constructed object (klass/flags/shape_id set, but no other attributes) forces us to include the klass/flags/shape_id assignment in the GC API, which prevents several optimizations I'd like to attempt in the next year or so:

* Inlining the klass/flags/shape assignment into callers of the NEWOBJ macro
* Inlining allocations in ZJIT (@tekknolagi has asked for the GC to be able to support this)
* Eliding reads and modifications to flags/klass/shape, both in C code and ZJIT (should happen automatically from the inlining above)
* Low-overhead sampling by keeping a freelist or bump pointer of known size https://pypy.org/posts/2025/02/pypy-gc-sampling.html#sampling-approach

I want to make this change; however, it may cause issues with some allocation profiling gems:

* ko1's `allocation_tracer` reads the value_type and klass when called
* ddtrace (?) previously tried to use object_id on freshly allocated objects (which was always buggy and crashes reliably in Ruby 4.0); I'm not sure what it does now (#21710)
* ruby-prof [checks if newly allocated objects are IMEMOs](https://github.com/ruby-prof/ruby-prof/blob/a84326d3b3f248d93b7ab297651cc59c...) (some previous discussion in #21854). I'm not sure why; it will probably work with this change and simply not skip imemo.

Stackprof, Vernier (my profiler), and ObjectSpace.trace_allocations will not have issues, as they do not attempt to access the memory of these objects; they only use the address.

I'd like to discuss how to make this change. Possible options:

* Make the change; gems that try to access flags/klass may crash
* Silently disable RUBY_INTERNAL_EVENT_NEWOBJ and introduce a new event or API. This would avoid any crashes, but existing gems would not work
* Skip the C optimizations. Disable ZJIT (with a warning?) when RUBY_INTERNAL_EVENT_NEWOBJ is used
* Add a `rb_gc_post_alloc(obj)` call to all call sites after the assignment (slow)

-- 
https://bugs.ruby-lang.org/
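
To make the proposed ordering concrete, here is a minimal standalone C simulation. It is only a sketch under stated assumptions, not CRuby code: every `sim_`-prefixed name is hypothetical, and `sim_obj` merely mimics the flags/klass/shape_id fields from the pseudocode above. It models a hook that fires before any field is assigned, with one hook that only records the address and one that reads the object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an object header; not a real CRuby struct. */
typedef struct sim_obj {
    uintptr_t flags;    /* 0 models a T_NONE slot */
    uintptr_t klass;
    uint32_t  shape_id;
} sim_obj;

typedef void (*sim_newobj_hook)(const void *addr);

#define SIM_HEAP_SLOTS 8
static sim_obj         sim_heap[SIM_HEAP_SLOTS];
static size_t          sim_next_slot;
static sim_newobj_hook sim_hook;

/* State written by the two example hooks below. */
static const void *sim_records[SIM_HEAP_SLOTS];
static size_t      sim_record_count;
static uintptr_t   sim_last_seen_flags;

/* Address-only hook: stores the pointer and never dereferences it. */
static void sim_record_address(const void *addr)
{
    assert(addr != NULL);
    if (sim_record_count < SIM_HEAP_SLOTS) {
        sim_records[sim_record_count++] = addr;
    }
}

/* Hook that reads the object; under the proposed ordering it sees zeros. */
static void sim_peek_flags(const void *addr)
{
    sim_last_seen_flags = ((const sim_obj *)addr)->flags;
}

/* Models the proposed allocator: the hook fires on a zeroed slot. */
static sim_obj *sim_gc_new_obj(void)
{
    if (sim_next_slot >= SIM_HEAP_SLOTS) return NULL;
    sim_obj *obj = &sim_heap[sim_next_slot++];
    memset(obj, 0, sizeof(*obj));
    if (sim_hook) sim_hook(obj);
    return obj;
}

/* Models the proposed newobj_of: fields are assigned after the hook ran. */
static sim_obj *sim_newobj_of(uintptr_t klass, uintptr_t flags, uint32_t shape_id)
{
    sim_obj *obj = sim_gc_new_obj();
    if (obj == NULL) return NULL;
    obj->flags    = flags;
    obj->shape_id = shape_id;
    obj->klass    = klass;
    return obj;
}

/* Resets the simulated heap and installs a hook (NULL disables it). */
static void sim_reset(sim_newobj_hook hook)
{
    sim_next_slot       = 0;
    sim_record_count    = 0;
    sim_last_seen_flags = UINTPTR_MAX;
    sim_hook            = hook;
}
```

In this model the address-only hook records exactly the pointers that `sim_newobj_of` later returns, so it is unaffected by the reordering, while the peeking hook observes `flags == 0` regardless of what the caller assigns afterwards. That is the behaviour change a gem reading klass or flags inside the hook would need to handle.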