
Issue #20009 has been updated by nobu (Nobuyoshi Nakada). Eregon (Benoit Daloze) wrote in #note-7:
So my suggestion would be: * Interpret the serialized module/class name as UTF-8, not as BINARY. And of course if it's only 7-bit as US-ASCII (already the case). * If the module/class name uses another encoding, we could either transcode it to UTF-8, or use nobu's trick of serializing the encoding name as a fake instance variable. Transcoding to UTF-8 seems simpler, and the only case it wouldn't work is for BINARY with non-7-bit, which seems of no value anyway. EDIT: mmh but I suppose transcoding to UTF-8 wouldn't work on CRuby because the lookup would fail.
I'm not against to interpret the default encoding as UTF-8, but don't think transcoding is intuitive as different encoding symbols are different even if they are same when transcoded. ---------------------------------------- Bug #20009: Marshal.load raises exception when load dumped class include non-ASCII https://bugs.ruby-lang.org/issues/20009#change-113318 * Author: ippachi (Kazuya Hatanaka) * Status: Open * ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin22] * Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN ---------------------------------------- ## Reproduction code ```ruby class Cクラス; end Marshal.load(Marshal.dump(Cクラス)) ``` ## Actual result ``` <internal:marshal>:34:in `load': undefined class/module C\xE3\x82\xAF\xE3\x83\xA9\xE3\x82\xB9 (ArgumentError) from marshal.rb:2:in `<main>' ``` ## Expected result Returns `Cクラス` ## Impacted area An exception is raised in Rails under the following conditions * minitest is used with default settings * Parallel execution with parallelize * test class names contain non-ASCII characters The default parallelization uses DRb, and Marshal is used inside DRb. ## Other After trying various things, I thought I could fix it by making `rb_path_to_class` support strings containing non-ASCII characters, but I couldn't find anything more than that. -- https://bugs.ruby-lang.org/