[ruby-core:113059] [Ruby master Bug#19563] Ripper.tokenize(code).join != code when heredoc and multiline %w[] literal is on the same line

Issue #19563 has been reported by tompng (tomoya ishida).

----------------------------------------
Bug #19563: Ripper.tokenize(code).join != code when heredoc and multiline %w[] literal is on the same line
https://bugs.ruby-lang.org/issues/19563

* Author: tompng (tomoya ishida)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.3.0dev (2023-03-29T21:57:52Z master 1b06422767) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
~~~ruby
require "ripper"

Ripper.tokenize "<<EOF || %w[hello\nEOF\n\n\n\nworld]"
# actual result
["<<EOF", " ", "||", " ", "%w[", "hello", "\n\n\n\n", "EOF\n", "world", "]"]
# expected result
["<<EOF", " ", "||", " ", "%w[", "hello", "\n", "EOF\n", "\n\n\n", "world", "]"]
~~~

The same happens with the `%i[]` literal.

--
https://bugs.ruby-lang.org/
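To see why the reported token list breaks the round trip, the two arrays from the report can be joined and compared with the source string. This is a minimal sketch that uses only the literal arrays quoted in the report, so it does not depend on the Ruby version that runs it; the helper name `round_trip?` is illustrative and not part of Ripper.

```ruby
# Hypothetical helper: true when joining the tokens reproduces the source exactly.
def round_trip?(tokens, code)
  tokens.join == code
end

# Source string from the report: heredoc opener and %w[ on the same line.
code = "<<EOF || %w[hello\nEOF\n\n\n\nworld]"

# Token lists copied verbatim from the report.
actual   = ["<<EOF", " ", "||", " ", "%w[", "hello", "\n\n\n\n", "EOF\n", "world", "]"]
expected = ["<<EOF", " ", "||", " ", "%w[", "hello", "\n", "EOF\n", "\n\n\n", "world", "]"]

# The reported tokens emit all four newlines before the heredoc terminator,
# so joining them moves "EOF\n" to the wrong place.
p round_trip?(actual, code)    # => false
p round_trip?(expected, code)  # => true
```

Both lists contain the same characters in total; only the position of the heredoc terminator relative to the newlines differs.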

Issue #19563 has been updated by nobu (Nobuyoshi Nakada).

A similar problem occurs with Unicode codepoint escapes.

```ruby
p Ripper.tokenize("<<EOS || %[\\u{4a\n""EOS\n\n\n\n""5a}]").join("")
# "<<EOS || %[EOS\n\n\n\n5a}]"
```

----------------------------------------
https://bugs.ruby-lang.org/issues/19563#change-102603
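To locate where the joined tokens drift from the source in a case like this, one can scan for the first differing character. The sketch below is illustrative only: `first_divergence` is a hypothetical helper, not a Ripper API, and the `reported` string is the output quoted in the message above.

```ruby
require "ripper"

# Hypothetical helper: index of the first character where the re-joined
# tokens differ from the source, or nil when the two strings are identical.
def first_divergence(source, rejoined)
  return nil if source == rejoined
  limit = [source.length, rejoined.length].min
  (0...limit).find { |i| source[i] != rejoined[i] } || limit
end

source   = "<<EOS || %[\\u{4a\nEOS\n\n\n\n5a}]"
reported = "<<EOS || %[EOS\n\n\n\n5a}]"  # output quoted in the message

# The strings agree up to "%[" and diverge where the "\u{4a" text is missing.
p first_divergence(source, reported)  # => 11

# The same check against the running interpreter; the result depends on
# whether that Ruby build includes the fix.
p first_divergence(source, Ripper.tokenize(source).join)
```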

Issue #19563 has been updated by nobu (Nobuyoshi Nakada).

File 0001-Bug-19563-Yield-words-separators-per-lines.patch added

The attached patch fails in an IRB test.
I'm not sure about this prompt transition; should it be fixed as expected?
Since line 004 ends the heredoc `C` but is still inside `%W[`, isn't `]` OK here?

```
[25/39] TestIRB::TestRubyLex#test_heredoc_with_embexpr = 0.00 s
  1) Failure:
TestIRB::TestRubyLex#test_heredoc_with_embexpr [/Users/nobu/src/ruby/master/src/test/irb/test_ruby_lex.rb:246]:
Expected dynamic prompt:
001:0:":*
002:0:":*
003:0:":*
004:0:":*
005:0:]:*
006:0:":*
007:0:":*
008:0:":*
009:0:]:*
010:0:]:*
011:0: :>
012:0: :*
Actual dynamic prompt:
001:0:":*
002:0:":*
003:0:":*
004:0:]:*
005:0:]:*
006:0:":*
007:0:":*
008:0:":*
009:0:]:*
010:0:]:*
011:0: :>
012:0: :*
..
<["001:0:\":* ", "002:0:\":* ", "003:0:\":* ", "004:0:\":* ", "005:0:]:* ", "006:0:\":* ", "007:0:\":* ", "008:0:\":* ", "009:0:]:* ", "010:0:]:* ", "011:0: :> ", "012:0: :* "]> expected but was
<["001:0:\":* ", "002:0:\":* ", "003:0:\":* ", "004:0:]:* ", "005:0:]:* ", "006:0:\":* ", "007:0:\":* ", "008:0:\":* ", "009:0:]:* ", "010:0:]:* ", "011:0: :> ", "012:0: :* "]>.
```

----------------------------------------
https://bugs.ruby-lang.org/issues/19563#change-102604

---Files--------------------------------
0001-Bug-19563-Yield-words-separators-per-lines.patch (4.35 KB)
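The two prompt lists in the failure output are long, so it helps to reduce them to the entries that actually differ. The sketch below copies both arrays from the test output; `differing_prompts` is a hypothetical helper written for this illustration and is not part of IRB's test suite.

```ruby
# Hypothetical helper: pair up the prompts and keep only mismatching pairs.
def differing_prompts(expected, actual)
  expected.zip(actual).reject { |exp, act| exp == act }
end

# Arrays copied from the failure message.
expected = ["001:0:\":* ", "002:0:\":* ", "003:0:\":* ", "004:0:\":* ",
            "005:0:]:* ", "006:0:\":* ", "007:0:\":* ", "008:0:\":* ",
            "009:0:]:* ", "010:0:]:* ", "011:0: :> ", "012:0: :* "]
actual   = ["001:0:\":* ", "002:0:\":* ", "003:0:\":* ", "004:0:]:* ",
            "005:0:]:* ", "006:0:\":* ", "007:0:\":* ", "008:0:\":* ",
            "009:0:]:* ", "010:0:]:* ", "011:0: :> ", "012:0: :* "]

# Only line 004 changes: the prompt indicator is `]` instead of `"`.
p differing_prompts(expected, actual)
# => [["004:0:\":* ", "004:0:]:* "]]
```

This matches the question raised in the message: the only disagreement is which literal line 004 is considered to be inside.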

Issue #19563 has been updated by tompng (tomoya ishida).

I've opened a pull request to fix irb's test. It changes the expected prompt and the target code so that the test avoids this bug in ruby 2.7~3.2.
https://github.com/ruby/irb/pull/558

----------------------------------------
https://bugs.ruby-lang.org/issues/19563#change-102688

Issue #19563 has been updated by tompng (tomoya ishida).

I merged irb's pull request. I think the failing test is fixed now.

----------------------------------------
https://bugs.ruby-lang.org/issues/19563#change-102698

Issue #19563 has been updated by nagachika (Tomoyuki Chikanaga).

MEMO: 4af9bd52cbb8cff7d149a8565012ab1153a4b5b1 is the follow-up commit for ac8a16237c727ae2a1446ef6dc810d0e750971fb.

----------------------------------------
https://bugs.ruby-lang.org/issues/19563#change-102727

* Status: Closed
participants (3)
- nagachika (Tomoyuki Chikanaga)
- nobu (Nobuyoshi Nakada)
- tompng (tomoya ishida)