[ruby-core:121597] [Ruby Bug#21257] YJIT can generate infinite loop when OOM

Issue #21257 has been reported by rianmcguire (Rian McGuire).

----------------------------------------
Bug #21257: YJIT can generate infinite loop when OOM
https://bugs.ruby-lang.org/issues/21257

* Author: rianmcguire (Rian McGuire)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
We've found an edge case where YJIT can generate an infinite loop (jump to the same address) when it's out-of-memory.

Reproduction:

```ruby
def first
  second
end

def second
  ::File
end

# Make `second` side exit on its first instruction
trace = TracePoint.new(:line) { }
trace.enable(target: method(:second))

32.times do |i|
  puts i

  first

  if i == 29
    # We've JITed the methods now - trigger the bug

    # Trigger a constant cache miss in rb_vm_opt_getconstant_path (in `second`) next time it's called
    module InvalidateConstantCache
      File = nil
    end

    # nb. this only works in yjit dev mode
    RubyVM::YJIT.simulate_oom!
  end
end
```

This hangs indefinitely when run with YJIT (`./configure --enable-yjit=dev` is required for simulate_oom).

If we attach a debugger to the Ruby process at this point, it's stuck in an infinite loop:

```
$ lldb -p 9753
(lldb) process attach --pid 9753
Process 9753 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x0000000104b202b8
->  0x104b202b8: b      0x104b202b8
    0x104b202bc: nop
    0x104b202c0: nop
    0x104b202c4: nop
Target 0: (ruby) stopped.
Executable module set to "/Users/rian/opt/ruby/bin/ruby".
Architecture set to: arm64-apple-macosx-.
```

We've reproduced this on:

```
ruby 3.3.6 (2024-11-05 revision 75015d4c1f) [x86_64-linux]
ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin23]
ruby 3.5.0dev (2025-04-08T06:46:45Z master b68fe530f1) +PRISM [arm64-darwin23]
```

--
https://bugs.ruby-lang.org/
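
Because the failure mode is a hang rather than a crash, anyone re-running the reproduction unattended may want a watchdog around it. The following is a minimal sketch, not part of the original report: the helper name `hangs?` and the timeout values are assumptions, and the script name in the usage comment is taken from the file name that appears in the disassembly later in the thread.

```ruby
require "rbconfig"
require "timeout"

# Run `argv` as a child process. Returns true if the child is still running
# after `seconds` (it is then killed), false if it exited within that time.
# `hangs?` is a hypothetical helper name, not something from the Ruby tree.
def hangs?(argv, seconds)
  pid = Process.spawn(*argv, out: File::NULL, err: File::NULL)
  begin
    Timeout.timeout(seconds) { Process.wait(pid) }
    false
  rescue Timeout::Error
    # The child did not finish in time: kill it and reap it.
    Process.kill("KILL", pid)
    Process.wait(pid)
    true
  end
end

# Usage, assuming the reproduction is saved as infinite-jmp.rb and the
# interpreter was built with --enable-yjit=dev:
#   hangs?([RbConfig.ruby, "--yjit", "infinite-jmp.rb"], 10)
```

A wall-clock timeout cannot tell a real infinite loop from a slow run, so the limit should be generous compared with the script's normal runtime.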

Issue #21257 has been updated by rianmcguire (Rian McGuire).

YJIT compiles the `first` and `second` methods to this (on x86_64-linux):

```
# regenerate_branch
# Block: first@infinite-jmp.rb:2 (chain_depth: 1)
# reg_temps: 00000001
# Insn: 0001 opt_send_without_block (stack_size: 1)
# call to Object#second
# guard known object with singleton class
0x5571d6436187: movabs rax, 0x7f6ea166c400
0x5571d6436191: cmp rsi, rax
0x5571d6436194: jne 0x5571d6438181
# stack overflow check
0x5571d643619a: lea rax, [rbx + 0x80]
0x5571d64361a1: cmp r13, rax
0x5571d64361a4: jbe 0x5571d64381a1
# store caller sp
0x5571d64361aa: lea rax, [rbx]
0x5571d64361ad: mov qword ptr [r13 + 8], rax
# save PC to CFP
0x5571d64361b1: movabs rax, 0x557205c1ce58
0x5571d64361bb: mov qword ptr [r13], rax
0x5571d64361bf: lea rax, [rbx + 0x20]
# push cme, specval, frame type
0x5571d64361c3: movabs rcx, 0x7f6e9decba30
0x5571d64361cd: mov qword ptr [rax - 0x18], rcx
0x5571d64361d1: mov qword ptr [rax - 0x10], 0
0x5571d64361d9: mov qword ptr [rax - 8], 0x11110003
# push callee control frame
0x5571d64361e1: mov qword ptr [r13 - 0x30], rax
0x5571d64361e5: movabs rcx, 0x7f6e9decbe50
0x5571d64361ef: mov qword ptr [r13 - 0x28], rcx
0x5571d64361f3: mov qword ptr [r13 - 0x20], rsi
0x5571d64361f7: mov qword ptr [r13 - 0x10], 0
# spill_temps: 00000001 -> 00000000
0x5571d64361ff: mov qword ptr [rbx], rsi
0x5571d6436202: mov rbx, rax
0x5571d6436205: sub rax, 8
0x5571d6436209: mov qword ptr [r13 - 0x18], rax
# update cfp->jit_return
0x5571d643620d: movabs rax, 0x5571d64381c5
0x5571d6436217: mov qword ptr [r13 - 8], rax
# switch to new CFP
0x5571d643621b: sub r13, 0x38
0x5571d643621f: mov qword ptr [r12 + 0x10], r13
# gen_direct_jmp: fallthrough
# Block: second@infinite-jmp.rb:6
# reg_temps: 00000000
# exit to interpreter on trace_opt_getconstant_path
0x5571d6436224: movabs rax, 0x557205c1d580
0x5571d643622e: mov qword ptr [r13], rax
0x5571d6436232: pop rbx
0x5571d6436233: pop r12
0x5571d6436235: pop r13
0x5571d6436237: mov eax, 0x24
0x5571d643623c: ret
```

Notably:

1. the first method is a fallthrough to the second - the branch is BranchGenFn::JumpToTarget0 and BranchShape::Next0, so the branch is effectively empty (see [gen_direct_jmp](https://github.com/ruby/ruby/blob/b68fe530f1880ed314099b61a70e3c0b1ee7cf6d/y...)).
2. the second method exits to the interpreter on its first instruction

After the methods have been compiled, the reproduction causes [rb_yjit_constant_ic_update](https://github.com/ruby/ruby/blob/b68fe530f1880ed314099b61a70e3c0b1ee7cf6d/y...) and [invalidate_block_version](https://github.com/ruby/ruby/blob/b68fe530f1880ed314099b61a70e3c0b1ee7cf6d/y...) to be called for the second method, which generates the infinite loop:

```
Invalidating block from second@infinite-jmp.rb:6, ISEQ offsets [0, 0)
# gen_direct_jmp: fallthrough
# Block: second@infinite-jmp.rb:6
# reg_temps: 00000000
# exit to interpreter on trace_opt_getconstant_path
# regenerate_branch
0x5571d6436224: jmp 0x5571d6436224
```

`invalidate_block_version` skips patching block to jump to block.entry_exit, because it [exits on entry](https://github.com/ruby/ruby/blob/b68fe530f1880ed314099b61a70e3c0b1ee7cf6d/y...) already. It then rewrites the incoming branch from the first method. As we're OOM, gen_branch_stub returns None, and we [fall back to using the invalidated block's exit](https://github.com/ruby/ruby/blob/b68fe530f1880ed314099b61a70e3c0b1ee7cf6d/y...) for the branch target, rather than a new stub. The invalidated block immediately follows the branch, which we detect (target_next is true) and [update the branch shape to BranchShape::Default](https://github.com/ruby/ruby/blob/b68fe530f1880ed314099b61a70e3c0b1ee7cf6d/y...). This means when the branch is regenerated, we [emit a jmp](https://github.com/ruby/ruby/blob/b68fe530f1880ed314099b61a70e3c0b1ee7cf6d/y...) to the block exit address.
The original branch code was a zero-length fallthrough, so this jmp is written over the start of the invalidated block (this is allowed). However, because that block exits on entry, the jmp target is the start address of that block and we end up with an infinite loop.

It feels like the invalid assumption here is that, if target_next is true, the new branch target will no longer be adjacent. This is normally true, as the target is a newly generated stub, but it breaks down if gen_branch_stub fails.

----------------------------------------
Bug #21257: YJIT can generate infinite loop when OOM
https://bugs.ruby-lang.org/issues/21257#change-112650

* Author: rianmcguire (Rian McGuire)
* Status: Open
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------
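
The sequence described in the update above can be replayed with a small toy model. This is only a sketch of the reasoning, not YJIT's implementation (which is written in Rust): the `Block` and `Branch` structs, `retarget_branch`, `scenario`, and the `check_adjacent:` flag are all hypothetical names, and the guard shown is just one way to test the "no longer adjacent" assumption, not necessarily the fix that was merged.

```ruby
# Toy model only: names echo the discussion but none of this is YJIT's real API.
Block  = Struct.new(:start_addr, :entry_exit)
Branch = Struct.new(:start_addr, :end_addr, :shape, :target)

# Re-emit the branch into `code` (a Hash of address => instruction text).
# :next0 is a zero-length fallthrough and emits nothing; :default emits a jmp.
def regenerate_branch(code, branch)
  code[branch.start_addr] = "jmp 0x%x" % branch.target if branch.shape == :default
end

# Rewrite an incoming branch after `block` has been invalidated.
# `stub_addr` is nil when stub allocation fails (the simulated OOM case).
# `check_adjacent:` turns on a hypothetical guard for illustration.
def retarget_branch(code, block, branch, stub_addr:, check_adjacent: false)
  # Under OOM there is no stub, so fall back to the invalidated block's exit.
  branch.target = stub_addr || block.entry_exit
  target_next    = block.start_addr == branch.end_addr
  still_adjacent = branch.target == branch.end_addr
  # Without the guard, target_next alone forces an explicit jmp.
  branch.shape = :default if target_next && !(check_adjacent && still_adjacent)
  regenerate_branch(code, branch)
end

# True if the instruction at `addr` jumps to itself.
def self_jump?(code, addr)
  code[addr] == "jmp 0x%x" % addr
end

def scenario(stub_addr:, check_adjacent: false)
  # `second` starts at 0x24 and exits on entry, so entry_exit == start_addr.
  block  = Block.new(0x24, 0x24)
  # The branch from `first` is a zero-length fallthrough ending at 0x24.
  branch = Branch.new(0x24, 0x24, :next0, 0x24)
  code   = { 0x24 => "exit to interpreter" }
  retarget_branch(code, block, branch, stub_addr: stub_addr, check_adjacent: check_adjacent)
  self_jump?(code, 0x24)
end

puts scenario(stub_addr: 0x100)                     # false: jmp goes to the new stub
puts scenario(stub_addr: nil)                       # true: jmp 0x24 written at 0x24
puts scenario(stub_addr: nil, check_adjacent: true) # false: fallthrough is kept
```

In the model, the self-jump appears only when the stub allocation fails and the fallback target coincides with the address the jmp is written to, which matches the `jmp 0x5571d6436224` at `0x5571d6436224` in the dump.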

Issue #21257 has been updated by hsbt (Hiroshi SHIBATA).

Status changed from Open to Assigned
Assignee set to yjit

----------------------------------------
Bug #21257: YJIT can generate infinite loop when OOM
https://bugs.ruby-lang.org/issues/21257#change-112652

* Author: rianmcguire (Rian McGuire)
* Status: Assigned
* Assignee: yjit
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------

Issue #21257 has been updated by rianmcguire (Rian McGuire).

I've had a swing at fixing this in https://github.com/ruby/ruby/pull/13186

----------------------------------------
Bug #21257: YJIT can generate infinite loop when OOM
https://bugs.ruby-lang.org/issues/21257#change-112797

* Author: rianmcguire (Rian McGuire)
* Status: Assigned
* Assignee: yjit
* Backport: 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN
----------------------------------------

Issue #21257 has been updated by k0kubun (Takashi Kokubun).

Backport changed from 3.2: DONTNEED, 3.3: REQUIRED, 3.4: REQUIRED to 3.2: DONTNEED, 3.3: REQUIRED, 3.4: DONE

ruby_3_4 commit:50b1759be00713535c41f5650feb3967c533450a.

----------------------------------------
Bug #21257: YJIT can generate infinite loop when OOM
https://bugs.ruby-lang.org/issues/21257#change-113228

* Author: rianmcguire (Rian McGuire)
* Status: Closed
* Assignee: jit
* Backport: 3.2: DONTNEED, 3.3: REQUIRED, 3.4: DONE
----------------------------------------

Issue #21257 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 3.2: DONTNEED, 3.3: REQUIRED, 3.4: DONE to 3.2: DONTNEED, 3.3: DONE, 3.4: DONE

ruby_3_3 commit:f57dd4470b9ba1e2e0007e814f94e8bb4fd2ab6f merged revision(s) commit:80a1a1bb8ae8435b916ae4f66a483e91ad31356a.

----------------------------------------
Bug #21257: YJIT can generate infinite loop when OOM
https://bugs.ruby-lang.org/issues/21257#change-113330

* Author: rianmcguire (Rian McGuire)
* Status: Closed
* Assignee: jit
* Backport: 3.2: DONTNEED, 3.3: DONE, 3.4: DONE
----------------------------------------

participants (4)
- hsbt (Hiroshi SHIBATA)
- k0kubun (Takashi Kokubun)
- nagachika (Tomoyuki Chikanaga)
- rianmcguire (Rian McGuire)