[ruby-core:115549] [Ruby master Bug#20030] `Ripper.tokenize('"\\C-あ"')` separates encoding valid string to encoding invalid string.

Issue #20030 has been reported by tompng (tomoya ishida).

----------------------------------------
Bug #20030: `Ripper.tokenize('"\\C-あ"')` separates encoding valid string to encoding invalid string.
https://bugs.ruby-lang.org/issues/20030

* Author: tompng (tomoya ishida)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0dev (2023-11-30T16:23:25Z master d048bae96b) [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
~~~ruby
Ripper.tokenize '"\\C-あ"'
# or
Ripper.tokenize "\"\\C-\u3042\""
# => ["\"", "\x81", "\x82", "\""]
~~~

I expect all tokens to be valid_encoding if the source string is valid_encoding.
This is causing IRB to crash when typing `"\C-あ"`.

--
https://bugs.ruby-lang.org/
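The expectation stated in the report can be written as a small check. This is only a sketch: `invalid_encoding_tokens` is a hypothetical helper name, not part of Ripper, and it relies only on `Ripper.tokenize` and `String#valid_encoding?`. On an affected build the result for the reported input should be non-empty; on a build with the fix it should presumably be empty.

~~~ruby
require "ripper"

# Hypothetical helper (not part of Ripper): returns the tokens that are
# not valid in their own encoding.
def invalid_encoding_tokens(source)
  Ripper.tokenize(source).reject(&:valid_encoding?)
end

# The input from the report, written with an escape so this file stays ASCII.
source = "\"\\C-\u3042\""

puts "source valid_encoding? #{source.valid_encoding?}"
puts "invalid tokens: #{invalid_encoding_tokens(source).inspect}"
~~~

A tool such as IRB that assumes every token is a valid string could use a check like this to detect the situation before processing the tokens further.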

Issue #20030 has been updated by nobu (Nobuyoshi Nakada).

https://github.com/ruby/ruby/pull/9091

This fixes another ripper scanner event issue at a syntax error.

----------------------------------------
Bug #20030: `Ripper.tokenize('"\\C-あ"')` separates encoding valid string to encoding invalid string.
https://bugs.ruby-lang.org/issues/20030#change-105495
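For readers unfamiliar with the term, scanner events are the per-token events that Ripper emits while lexing. A minimal way to look at them is `Ripper.lex`, which returns the position, event name, token text, and lexer state for each token. The helper name `scanner_events` below is hypothetical, and the sketch uses a plain input rather than the one from the pull request.

~~~ruby
require "ripper"

# Hypothetical helper: reduce Ripper.lex output to [event, token] pairs.
def scanner_events(source)
  Ripper.lex(source).map { |(_pos, event, token, _state)| [event, token] }
end

scanner_events("a = 1").each do |event, token|
  puts "#{event} #{token.inspect}"
end
~~~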

Issue #20030 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 3.2: REQUIRED to 3.2: DONE

ruby_3_2 commit:a804d5514c7c0608b9fb52426ec3ec738420ad29 merged revision(s) commit:d503e1b95a40e45d7767e0175de60092de4ba54e.

----------------------------------------
Bug #20030: `Ripper.tokenize('"\\C-あ"')` separates encoding valid string to encoding invalid string.
https://bugs.ruby-lang.org/issues/20030#change-109136

* Author: tompng (tomoya ishida)
* Status: Closed
* ruby -v: ruby 3.3.0dev (2023-11-30T16:23:25Z master d048bae96b) [x86_64-linux]
* Backport: 3.2: DONE

participants (3)
- nagachika (Tomoyuki Chikanaga)
- nobu (Nobuyoshi Nakada)
- tompng (tomoya ishida)