[ruby-core:113715] [Ruby master Bug#19701] The rb_classext_t::classpath field is not marked for T_ICLASS

Issue #19701 has been reported by wks (Kunshan Wang).

----------------------------------------
Bug #19701: The rb_classext_t::classpath field is not marked for T_ICLASS
https://bugs.ruby-lang.org/issues/19701

* Author: wks (Kunshan Wang)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I am hacking Ruby to dump information about some objects, and I found that the `rb_classext_t::classpath` field for `T_ICLASS` objects sometimes contains dangling references to dead objects.

The commit https://github.com/ruby/ruby/commit/081cc4eb283cb01ddffb364397e5175dbfacab66 set the `classpath` field of `rb_mRubyVMFrozenCore` to a string "FrozenCore" so that it can be dumped using the `rb_dump_literal` function. However, in `gc_mark_children`, if the `obj` is a `T_ICLASS`, the `RCLASS_EXT(obj)->classpath` will not be marked. As a result, if `rb_mRubyVMFrozenCore` is the only object that holds a reference to the string "FrozenCore", the string will be considered garbage and reclaimed during a GC, and the `classpath` will contain a dangling pointer.

There are two solutions to this problem. We can take *one* of the approaches below (not both).

1. Let the GC mark the `classpath` field. I drafted a pull request here: https://github.com/ruby/ruby/pull/7875
2. Revert the commit https://github.com/ruby/ruby/commit/081cc4eb283cb01ddffb364397e5175dbfacab66

Marking the `classpath` field in GC will keep the `rb_dump_literal` function working. For debug purposes, we can also use that field to identify what object a given `T_ICLASS` is. Adding one marked field may make GC slower, but I don't think it will be observable because there are far fewer `T_ICLASS` objects than ordinary objects.

If we revert the commit above, the `classpath` will always be blank for all `T_ICLASS` objects. (Question: How do we enforce it?) It will also save some memory by keeping fewer strings alive. However, currently, "FrozenCore" seems to be the only `T_ICLASS` that has its classpath set, so the memory saving may not be significant. I don't know what purpose the `rb_dump_literal` function originally served. Maybe it is still important. Maybe it is safe to remove now.

Which of the two approaches should we take? It looks like each of them has its pros and cons.

-- 
https://bugs.ruby-lang.org/
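To make the failure mode in the report concrete, here is a minimal sketch of a toy mark-and-sweep collector, not CRuby's `gc.c`. Every name in it (`toy_obj`, `toy_mark`, `toy_sweep`, `toy_classpath_dangles`) is hypothetical and exists only for illustration. It assumes a single root `ICLASS` whose `classpath` points at a string that nothing else references, and shows that the string is reclaimed unless the mark phase follows that field, which is the idea behind option 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object model. None of these names are CRuby's. */
enum toy_type { TOY_STRING, TOY_CLASS, TOY_ICLASS };

struct toy_obj {
    enum toy_type type;
    bool marked;
    bool freed;                /* sweep sets this instead of calling free(), so callers can inspect it */
    struct toy_obj *classpath; /* only meaningful for TOY_CLASS and TOY_ICLASS */
};

/* Mark phase. When mark_iclass_classpath is false, the TOY_ICLASS case
 * skips the classpath field, mirroring the behaviour described in the report. */
static void toy_mark(struct toy_obj *obj, bool mark_iclass_classpath)
{
    if (obj == NULL || obj->marked) return;
    obj->marked = true;

    switch (obj->type) {
      case TOY_CLASS:
        toy_mark(obj->classpath, mark_iclass_classpath);
        break;
      case TOY_ICLASS:
        if (mark_iclass_classpath) {
            toy_mark(obj->classpath, mark_iclass_classpath);
        }
        break;
      case TOY_STRING:
        break;
    }
}

/* Sweep phase: anything left unmarked is reclaimed; mark bits are reset. */
static void toy_sweep(struct toy_obj **heap, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!heap[i]->marked) heap[i]->freed = true;
        heap[i]->marked = false;
    }
}

/* Runs one GC cycle with the ICLASS as the only root.
 * Returns true if its classpath now refers to a reclaimed object. */
static bool toy_classpath_dangles(bool mark_iclass_classpath)
{
    struct toy_obj name   = { TOY_STRING, false, false, NULL };
    struct toy_obj iclass = { TOY_ICLASS, false, false, &name };
    struct toy_obj *heap[] = { &name, &iclass };

    toy_mark(&iclass, mark_iclass_classpath);
    toy_sweep(heap, 2);
    return iclass.classpath->freed;
}
```

Calling `toy_classpath_dangles(false)` models the current behaviour and reports a dangling `classpath`; calling it with `true` models marking the field and keeps the string alive. The real cost trade-off discussed in the thread (one extra field visited per `T_ICLASS`) is not modelled here.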

Issue #19701 has been updated by peterzhu2118 (Peter Zhu).

I prefer option 2. Most (if not all) other ICLASS objects do not have a classpath, so we shouldn't pay the performance penalty of marking the field for every ICLASS; that would be wasted computational effort just to make `FrozenCore` easier to debug. Additionally, we can optimize bootup to have one fewer object allocation.

----------------------------------------
Bug #19701: The rb_classext_t::classpath field is not marked for T_ICLASS
https://bugs.ruby-lang.org/issues/19701#change-103367

Issue #19701 has been updated by jeremyevans0 (Jeremy Evans).

There is a third option: set "FrozenCore" as an fstring that doesn't get garbage collected (via `rb_gc_register_address` or something). That reduces the cost to 1 object marking per major GC. That seems to be the best option to me if the commit shouldn't be reverted. As to whether the commit should be reverted, hopefully @nobu can answer that.

@wks can you provide example code that crashes Ruby with the current implementation? It would be useful when committing a test/spec that fixes this issue.

----------------------------------------
Bug #19701: The rb_classext_t::classpath field is not marked for T_ICLASS
https://bugs.ruby-lang.org/issues/19701#change-103523
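The third option can be illustrated with a similar toy model. This is only a sketch of the idea of registering the address of a variable as a GC root; it does not use CRuby's `rb_gc_register_address`, and every name in it (`toy_obj`, `toy_register_address`, `toy_gc`, `toy_classpath_survives`) is hypothetical. The collector here never follows the `classpath` field, yet the string survives because a registered root refers to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object. None of these names are CRuby's. */
struct toy_obj {
    bool marked;
    bool freed;                /* set by the sweep instead of calling free() */
    struct toy_obj *classpath; /* never followed by the toy collector */
};

#define TOY_MAX_ROOTS 8
static struct toy_obj **toy_roots[TOY_MAX_ROOTS];
static size_t toy_root_count = 0;

/* Remember the address of a variable; whatever it points to at GC time
 * is treated as live. Returns false if the root table is full. */
static bool toy_register_address(struct toy_obj **addr)
{
    if (toy_root_count == TOY_MAX_ROOTS) return false;
    toy_roots[toy_root_count++] = addr;
    return true;
}

/* One GC cycle: mark the registered roots and the live ICLASS itself,
 * but deliberately do not mark the ICLASS's classpath field. */
static void toy_gc(struct toy_obj **heap, size_t n, struct toy_obj *live_iclass)
{
    for (size_t i = 0; i < toy_root_count; i++) {
        if (*toy_roots[i] != NULL) (*toy_roots[i])->marked = true;
    }
    live_iclass->marked = true;

    for (size_t i = 0; i < n; i++) {
        if (!heap[i]->marked) heap[i]->freed = true;
        heap[i]->marked = false;
    }
}

/* Returns true if the classpath string is still alive after one cycle. */
static bool toy_classpath_survives(bool register_root)
{
    static struct toy_obj *frozen_core_name; /* static: its address must stay valid */
    struct toy_obj name   = { false, false, NULL };
    struct toy_obj iclass = { false, false, &name };
    struct toy_obj *heap[] = { &name, &iclass };

    toy_root_count = 0;                      /* start each run with an empty root table */
    frozen_core_name = &name;
    if (register_root) toy_register_address(&frozen_core_name);

    toy_gc(heap, 2, &iclass);
    return !iclass.classpath->freed;
}
```

With the root registered, `toy_classpath_survives(true)` holds even though the ICLASS case does no extra marking, so the per-cycle cost is one root visit rather than one field visit per ICLASS. The sketch says nothing about object movement during compaction, which the thread raises separately.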

Issue #19701 has been updated by wks (Kunshan Wang).

jeremyevans0 (Jeremy Evans) wrote in #note-2:
> There is a third option, set "FrozenCore" as an fstring that doesn't get garbage collected (via `rb_gc_register_address` or something). That reduces the cost to 1 object marking per major GC. That seems to be the best option to me if the commit shouldn't be reverted. As to whether the commit should be reverted, hopefully @nobu can answer that.
>
> @wks can you provide example code that crashes Ruby with the current implementation? It would be useful when committing a test/spec that fixes this issue.

You can try this patch: ```diff diff --git a/gc.c b/gc.c index 6bbfa4e5ef..5de8d0b446 100644 --- a/gc.c +++ b/gc.c @@ -7201,6 +7201,14 @@ gc_mark_children(rb_objspace_t *objspace, VALUE obj) break; case T_ICLASS: + { + VALUE classpath = rb_class_path_cached(obj); + printf("classpath: %p\n", (void*)classpath); + if (RTEST(classpath)) { + puts(rb_str_to_cstr(classpath)); + } + } + if (RICLASS_OWNS_M_TBL_P(obj)) { mark_m_tbl(objspace, RCLASS_M_TBL(obj)); } ``` Then run a hello world program `./miniruby -e 'puts "Hello"'`. It should crash in the second GC when it tries to print info about `FrozenCore`. ``` $ ./miniruby -e 'puts "Hello"' classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x7fc26454b860 RubyVM::FrozenCore classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x4 classpath: 0x7fc26454b860 ./miniruby: [BUG] Segmentation fault at 0x0000000000000014 ruby 3.3.0dev (2023-06-13T03:28:33Z master 3924dba552) [x86_64-linux] ... ``` I think it is a nice workaround to keep the string `"RubyVM::FrozenCore"` alive and not moved by holding it with a pinning root. CRuby needs the capability of object pinning anyway due to conservative stack scanning. But it is not a beautiful and complete solution because it makes a special case to the object layout. 
----------------------------------------
Bug #19701: The rb_classext_t::classpath field is not marked for T_ICLASS
https://bugs.ruby-lang.org/issues/19701#change-103583