[ruby-core:120435] [Ruby master Bug#20990] Ripper.tokenize splits `"\C-\あ"` into tokens with invalid byte sequence

Issue #20990 has been reported by tompng (tomoya ishida).

----------------------------------------
Bug #20990: Ripper.tokenize splits `"\C-\あ"` into tokens with invalid byte sequence
https://bugs.ruby-lang.org/issues/20990

* Author: tompng (tomoya ishida)
* Status: Open
* ruby -v: ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +MN [arm64-darwin22]
* Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
IRB crashes when code is tokenized into tokens that contain an invalid byte sequence.

~~~ruby
Ripper.tokenize '"\C-\あ"'
#=> ["\"", "\\C-\\\xE3\x81", "\x82", "\""]
~~~

I think the error reported when evaluating `"\C-\あ"` should be `Invalid escape character syntax`, just like for `"\C-あ"`.

~~~
$ ./ruby --parser=parse.y -e '"\C-あ"'
-e:1: Invalid escape character syntax
"\C-あ"
$ ./ruby --parser=parse.y -e '"\C-\あ"'
-e:1: invalid multibyte char (UTF-8)
-e:1: invalid multibyte char (UTF-8)
./ruby: compile error (SyntaxError)
~~~

--
https://bugs.ruby-lang.org/
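
For anyone who wants to check a given Ruby build for this behavior, here is a minimal sketch that lists the tokens returned by `Ripper.tokenize` that are not valid in their own encoding. The helper name `invalid_tokens` is hypothetical and not part of Ripper. The script assumes a UTF-8 source file, and what it prints depends on the Ruby version used.

~~~ruby
require "ripper"

# Hypothetical helper (not part of Ripper): returns the tokens whose bytes
# do not form a valid string in the token's own encoding.
def invalid_tokens(source)
  Ripper.tokenize(source).reject(&:valid_encoding?)
end

# The input from the report: a control escape followed by a backslash
# and a multibyte character.
source = '"\C-\あ"'
bad = invalid_tokens(source)

if bad.empty?
  puts "all tokens are valid #{source.encoding} strings"
else
  puts "tokens with invalid byte sequences: #{bad.map(&:inspect).join(', ')}"
end
~~~

A tool that consumes Ripper tokens could presumably run a check like this before handing tokens to code that assumes valid strings, rather than failing later on an invalid byte sequence.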

Issue #20990 has been updated by tompng (tomoya ishida).

Pull request: https://github.com/ruby/ruby/pull/12484

----------------------------------------
Bug #20990: Ripper.tokenize splits `"\C-\あ"` into tokens with invalid byte sequence
https://bugs.ruby-lang.org/issues/20990#change-111214

--
https://bugs.ruby-lang.org/