[ruby-core:113619] [Ruby master Bug#18738] IRB can't recognize heredoc after words

24 May 2023

Issue #18738 has been updated by ccmywish (Aoran Zeng).

This has been fixed. Please close it.

----------------------------------------
Bug #18738: IRB can't recognize heredoc after words
https://bugs.ruby-lang.org/issues/18738#change-103263

* Author: ccmywish (Aoran Zeng)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.1p18 (2022-02-18 revision 53f5fc4236) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
My `irb_info` output:
```ruby
irb(main):001:0> irb_info
=> 
Ruby version: 3.1.1                            
IRB version: irb 1.4.1 (2021-12-25)            
InputMethod: ReidlineInputMethod with Reline 0.3.1
RUBY_PLATFORM: x86_64-linux                    
LANG env: en_US.UTF-8                          
East Asian Ambiguous Width: 1  
```

Consider the following code:

```ruby
a, b = <<EOF, %w[ hello
thank you
ruby devs
EOF
world
]
p a
p b
```
This works as expected if you save it to a file and run it with `ruby xxx.rb`. The output is:
```ruby
"thank you\nruby devs\n"
["hello", "world"]
```

But when you type it into `irb`, the input never terminates, and you get:
```ruby
❯ irb
irb(main):001:0] a, b = <<EOF, %w[ hello
irb(main):002:0] thank you
irb(main):003:0] ruby devs
irb(main):004:0] EOF
irb(main):005:0] world
irb(main):006:0" ]
irb(main):007:-" 
irb(main):008:0" 
irb(main):009:0" 
irb(main):010:0" 
```
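
To make the symptom easier to reason about, here is a minimal sketch of a line-by-line completeness check. It is not IRB's actual algorithm (IRB's `ruby-lex` is more elaborate); it only assumes that `Ripper.sexp` returns `nil` for source that does not yet parse, and asks after each typed line whether the input so far is a complete program. The names `lines` and `complete_after` are made up for this illustration.

```ruby
require "ripper"

# The six lines as they would be typed into a REPL, one at a time.
lines = [
  "a, b = <<EOF, %w[ hello",
  "thank you",
  "ruby devs",
  "EOF",
  "world",
  "]",
]

# For each prefix of the input, ask Ripper whether it parses without error.
# Ripper.sexp returns nil when the source has a syntax error, which
# includes input that is still unterminated.
complete_after = lines.each_index.map do |i|
  prefix = lines[0..i].join("\n") + "\n"
  !Ripper.sexp(prefix).nil?
end

p complete_after
```

If the last entry is `true`, the parser itself accepts the full input, and a prompt that keeps waiting after the closing `]` is being held open by the layer that tracks continuation state rather than by the grammar.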

I found this issue while reading the mruby source code. In mruby, the token after `hello` on the first line should be `tHD_LITERAL_DELIM`, but CRuby has no such token. I tried dumping CRuby's parser state and found that, right after reading `<<EOF`, it recognizes the whole heredoc body `"thank you\nruby devs\n"` as a single token. So I think this may not be a bug in Ripper itself, but in how IRB calls Ripper line by line through its `ruby-lex`.
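
As a rough way to check the Ripper side of this, the sketch below lexes the whole snippet at once with `Ripper.lex` and prints the token stream. It only shows what Ripper reports when given the complete source; it says nothing about how IRB drives it. The variable names `src` and `events` are hypothetical.

```ruby
require "ripper"

# The snippet from the report, embedded with a quoted squiggly heredoc
# so that the inner <<EOF is kept as plain text.
src = <<~'RUBY'
  a, b = <<EOF, %w[ hello
  thank you
  ruby devs
  EOF
  world
  ]
RUBY

# Ripper.lex returns entries of the form [[line, column], event, token, state].
tokens = Ripper.lex(src)
events = tokens.map { |entry| entry[1] }

# Print each token with its position and event name.
tokens.each do |(line, col), event, tok, _state|
  puts format("%d:%-2d %-20s %p", line, col, event, tok)
end
```

When the complete source is available, the stream should contain both the heredoc delimiters (`on_heredoc_beg`, `on_heredoc_end`) and the `%w[` opener (`on_qwords_beg`), which is consistent with the idea that the lexer handles the construct when it sees all lines together.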

For your convenience, here is the parser state:

```
Stack now 0 2 82 341
Entering state 580
Next token is token "string literal" (1.7-1.12: )
Shifting token "string literal" (1.7-1.12: )
Entering state 60
Reducing stack by rule 613 (line 4830):
-> $$ = nterm string_contents (1.12-1.12: )
Stack now 0 2 82 341 580 60
Entering state 301
Reading a token: Next token is token "literal content" (1.12-1.12: "thank you\nruby devs\n")
Shifting token "literal content" (1.12-1.12: "thank you\nruby devs\n")
Entering state 507
Reducing stack by rule 619 (line 4926):
   $1 = token "literal content" (1.12-1.12: "thank you\nruby devs\n")
-> $$ = nterm string_content (1.12-1.12: )
Stack now 0 2 82 341 580 60 301
Entering state 511
Reducing stack by rule 614 (line 4840):
   $1 = nterm string_contents (1.12-1.12: )
   $2 = nterm string_content (1.12-1.12: )
-> $$ = nterm string_contents (1.12-1.12: )
Stack now 0 2 82 341 580 60
Entering state 301
Reading a token: 
lex_state: BEG -> END at line 7453
Next token is token "terminator" (1.12-1.12: )
Shifting token "terminator" (1.12-1.12: )
Entering state 512
Reducing stack by rule 596 (line 4693):
   $1 = token "string literal" (1.7-1.12: )
   $2 = nterm string_contents (1.12-1.12: )
   $3 = token "terminator" (1.12-1.12: )
-> $$ = nterm string1 (1.7-1.12: )
Stack now 0 2 82 341 580
Entering state 109
Reducing stack by rule 594 (line 4683):
   $1 = nterm string1 (1.7-1.12: )
-> $$ = nterm string (1.7-1.12: )
Stack now 0 2 82 341 580
Entering state 108
Reading a token: 
lex_state: END -> BEG|LABEL at line 9814
Next token is token ',' (1.12-1.13: )
```

-- 
https://bugs.ruby-lang.org/

ccmywish (Aoran Zeng)
