
Issue #20009 has been updated by byroot (Jean Boussier).

I dug into this bug, and I'm not sure if it's possible to fix it.

Classes are serialized this way:

```c
      case T_CLASS:
        w_byte(TYPE_CLASS, arg);
        {
            VALUE path = class2path(obj);
            w_bytes(RSTRING_PTR(path), RSTRING_LEN(path), arg);
            RB_GC_GUARD(path);
        }
        break;
```

We write the `TYPE_CLASS` prefix, and then write the bytes of the class name, without any indication of its encoding.

Then on `load`, we just read the bytes and try to look up the class:

```c
      case TYPE_CLASS:
        {
            VALUE str = r_bytes(arg);

            v = path2class(str);
```

So on `load` we're looking for `"Cクラス".b.to_sym`, which doesn't match `:"Cクラス"`.

To fix this we'd need to include the encoding in the format, but that would mean breaking backward and forward compatibility, which is a huge deal.

### Half-way solution

A possible half-way solution would be:

  - Assume non-ASCII class names are UTF-8
  - Raise on dump for class names that are not UTF-8 compatible.

It's far from ideal though.

----------------------------------------
Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII
https://bugs.ruby-lang.org/issues/20009#change-105364

* Author: ippachi (Kazuya Hatanaka)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
## Reproduction code

```ruby
class Cクラス; end
Marshal.load(Marshal.dump(Cクラス))
```

## Actual result

```
<internal:marshal>:34:in `load': undefined class/module C\xE3\x82\xAF\xE3\x83\xA9\xE3\x82\xB9 (ArgumentError)
	from marshal.rb:2:in `<main>'
```

## Expected result

Returns `Cクラス`

## Impacted area

An exception is raised in Rails under the following conditions:

* minitest is used with default settings
* tests are run in parallel with `parallelize`
* test class names contain non-ASCII characters

The default parallelization uses DRb, and Marshal is used inside DRb.
## Other

After trying various things, I thought I could fix it by making `rb_path_to_class` support strings containing non-ASCII characters, but I couldn't find anything more than that.

-- 
https://bugs.ruby-lang.org/