Issue #21993 has been updated by luke-gru (Luke Gruber). Nice catch! I think this would only be a problem if the key was a number that was also a valid pointer into the Ruby heap and it happened to point to a T_MOVED object. But it should definitely be fixed. ---------------------------------------- Bug #21993: `rb_gc_update_tbl_refs` is incorrectly documented as the dcompact pair for `rb_mark_tbl_no_pin`, and is unsafe for `st_table`s with non-VALUE keys https://bugs.ruby-lang.org/issues/21993#change-117043 * Author: ivoanjo (Ivo Anjo) * Status: Open * Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN ---------------------------------------- Hey! I work for Datadog on the [Ruby profiler](https://github.com/datadog/dd-trace-rb) and I've been exploring the TypedData GC API to understand the correct patterns for writing compaction-aware extensions. While reading through the API, I noticed a mismatch between the documented pairing of `rb_mark_tbl_no_pin` + `rb_gc_update_tbl_refs` and what the implementation actually does that seems like it can cause crashes and even corruption for extensions. ### The problem The public header documents `rb_gc_update_tbl_refs` as the `dcompact` counterpart for `rb_mark_tbl_no_pin`: ``` * Updates references inside of tables. After you marked values using * rb_mark_tbl_no_pin(), the objects inside of the table could of course be * moved. This function is to fixup those references. ``` However, `rb_mark_tbl_no_pin`'s implementation only marks **values**, not keys: ```c // gc.c / gc/gc.h static int gc_mark_tbl_no_pin_i(st_data_t key, st_data_t value, st_data_t data) { rb_gc_mark_movable((VALUE)value); // key is never touched return ST_CONTINUE; } ``` Thus, tables used with `rb_mark_tbl_no_pin` can have non-VALUE keys (e.g. `ID`s, raw integers, C pointers, etc). The problem is that `rb_gc_update_tbl_refs` (via `gc_update_table_refs`) calls `rb_gc_location` on **both** keys and values: ```c static int hash_foreach_replace(st_data_t key, st_data_t value, ...) { if (rb_gc_location((VALUE)key) != (VALUE)key) // <- called on key return ST_REPLACE; if (rb_gc_location((VALUE)value) != (VALUE)value) return ST_REPLACE; ... } ``` Calling `rb_gc_location` on something that's not a `VALUE` and even sometimes replacing it is probably going to cause crashes and heap corruption. ### Suggested fixes I can think of a few ways to address this: **a) Deprecate `rb_gc_update_tbl_refs`** Maybe just deprecate it? I'm not even sure if this gets used very often. **b) Introduce `rb_mark_hash_no_pin`** The `mark_keyvalue` iterator already exists internally and is used by `mark_hash()` for Ruby's own `Hash` objects. Exposing a public `rb_mark_hash_no_pin` would give `rb_gc_update_tbl_refs` a safe, correct `dmark` counterpart for tables where both keys and values are Ruby objects. **c) Fix the docs and document the precondition for `rb_gc_update_tbl_refs`** Update the docs to make explicit that `rb_gc_update_tbl_refs` assumes a `VALUE => VALUE` table, and that all keys **must have been marked** (e.g. via `rb_mark_set` or a manual iteration with `rb_gc_mark_movable`) in addition to calling `rb_mark_tbl_no_pin` for the values. Weird and ugly, but correct? --- Happy to help with a patch if useful. -- https://bugs.ruby-lang.org/