[ruby-core:125546] [Ruby Bug#22075] heap-use-after-free in `rb_vm_ci_lookup` under parallel Ractors
Issue #22075 has been reported by zonuexe (Kenta USAMI).

----------------------------------------
Bug #22075: heap-use-after-free in `rb_vm_ci_lookup` under parallel Ractors
https://bugs.ruby-lang.org/issues/22075

* Author: zonuexe (Kenta USAMI)
* Status: Open
* ruby -v: 4.0.5
* Backport: 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN
----------------------------------------
## Environment

- `ruby 4.0.5 (2026-05-20 revision 64336ffd0e) +PRISM [aarch64-linux]`
- Also reproduced on `ruby 4.0.4 ... [x86_64-linux]`.

## Summary

Under a parallel Ractor pool, a GC sweep on one Ractor frees a `T_DATA` object while `rb_vm_ci_lookup` on another Ractor reads that freed heap memory. Pure-Ruby code triggers it; the VM crashes (SIGSEGV).

## Reproduction

No minimal script isolated yet. Reproduced with the `rigor` project at tag `v0.1.7` (https://github.com/rigortype/rigor/tree/v0.1.7).

1. Build Ruby + native gems with AddressSanitizer:

   ```
   ./configure --disable-yjit \
     cflags="-fsanitize=address -fno-sanitize-address-use-after-scope -g -O1" \
     cppflags="-fsanitize=address" LDFLAGS="-fsanitize=address"
   ```

2. Run a spec that drives a 2-worker Ractor pool:

   ```
   ASAN_OPTIONS="detect_leaks=0 detect_stack_use_after_return=0 halt_on_error=0" \
   RIGOR_INCLUDE_RACTOR_POOL=1 \
   bundle exec rspec spec/rigor/analysis/runner_pool_spec.rb
   ```

CI failure: https://github.com/rigortype/rigor/actions/runs/26123249293/job/76830355801

## Expected

Pure-Ruby code on a Ractor pool runs to completion or raises a Ruby exception. It does not corrupt the heap.

## Actual

SIGSEGV (≈70% of runs without a sanitizer; crash site varies).

AddressSanitizer (Ruby 4.0.5, aarch64-linux, 3 Ractors):

```
ERROR: AddressSanitizer: heap-use-after-free
READ of size 4
  rb_vm_ci_lookup              vm_method.c:699
  vm_ci_new_                   vm_callinfo.h:219
  vm_ci_new_runtime_           vm_callinfo.h:240
  vm_search_super_method       vm_insnhelper.c:5152   (a Ruby `super` call)
freed by:
  rb_gc_impl_free              gc/default/default.c:8279
  rb_data_free                 gc.c:1205
  rb_gc_obj_free               gc.c:1351
  gc_sweep_plane               gc/default/default.c:3510
previously allocated by:
  rb_data_typed_object_zalloc  gc.c:1131
  rbs_new_location2            (rbs gem C extension)
SUMMARY: AddressSanitizer: heap-use-after-free vm_method.c:699 in rb_vm_ci_lookup
```

The same line also surfaces as `heap-buffer-overflow`.

Original CI `[BUG]` (Ruby 4.0.4, x86_64-linux), same path:

```
[BUG] Segmentation fault
vm_ci_hash <- rb_st_update <- rb_vm_ci_lookup (vm_method.c:712)
  <- vm_ci_new_runtime_ <- vm_search_super_method
Total ractor count: 5
```

## Notes

- ThreadSanitizer reports nothing: it cannot follow the M:N scheduler.
- Both ASAN stacks show `thread T0` because the M:N scheduler multiplexes Ractors onto native threads; the run uses 3 Ractors.
- The free path is a plain GC sweep (`gc_sweep_plane` → `rb_data_free`), no finalizer.

## Related issues

- #21200: parallel Ractor spurious segfault/hang; root cause unidentified. Possibly the same class of bug.
- #21204: same class of heap corruption, but scoped to MMTk GC. This report reproduces it on the **default GC**.

--
https://bugs.ruby-lang.org/
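Since the report describes the crash as intermittent (roughly 70% of runs without a sanitizer), it can help to repeat the spec and count how often the process dies from a signal, both before and after applying a candidate fix. The following is only a sketch: `crash_rate` is a hypothetical helper, not part of Ruby or `rigor`, and it assumes a VM crash ends the process with a signal (a `[BUG]` abort normally does) rather than an ordinary non-zero exit.

```ruby
require "open3"

# Hypothetical helper: runs `command` (an argv array, no shell involved)
# `runs` times and returns the fraction of runs that were terminated by a
# signal. Ordinary failures (non-zero exit status) are not counted.
def crash_rate(command, runs: 10, env: {})
  crashes = runs.times.count do
    _output, status = Open3.capture2e(env, *command)
    status.signaled?
  end
  crashes.fdiv(runs)
end

# Example invocation, using the command and environment variable from the
# reproduction steps above (left commented out because it needs the rigor
# checkout and its bundle):
#
#   rate = crash_rate(
#     %w[bundle exec rspec spec/rigor/analysis/runner_pool_spec.rb],
#     runs: 20,
#     env: { "RIGOR_INCLUDE_RACTOR_POOL" => "1" }
#   )
#   puts format("crashed in %.0f%% of runs", rate * 100)
```

Counting only signal deaths keeps ordinary spec failures from inflating the figure; whether that matches how the 70% in the report was measured is not stated there.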
Issue #22075 has been updated by jhawthorn (John Hawthorn).

Assignee set to ractor

Thank you for the reliable reproduction.

I believe the issue is that `super` calls are duplicating callinfos, but the reference count for the kwargs array isn't using atomics.

```ruby
class Base
  def foo(a:, b:, c:) = a
end

class Sub < Base
  # `super` with explicit keyword arguments: the call shape that reaches
  # vm_search_super_method in the reported stack.
  def foo(a:, b:, c:) = super(a: a, b: b, c: c)
end

# Eight Ractors execute the same `super` call site in parallel.
8.times.map do
  Ractor.new do
    obj = Sub.new
    3_000_000.times { obj.foo(a: 1, b: 2, c: 3) }
  end
end.each(&:join)
```

We should be able to fix it with atomics. https://github.com/ruby/ruby/pull/17053

----------------------------------------
Bug #22075: heap-use-after-free in `rb_vm_ci_lookup` under parallel Ractors
https://bugs.ruby-lang.org/issues/22075#change-117372

* Author: zonuexe (Kenta USAMI)
* Status: Open
* Assignee: ractor
* ruby -v: 4.0.5
* Backport: 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN
----------------------------------------
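The diagnosis in John Hawthorn's reply (a reference count that is updated without atomics) can be pictured with a small model. This is a sketch in plain Ruby, not CRuby's callinfo code and not the patch in the linked pull request: `SharedBuffer`, `racy_scenario`, and `atomic_scenario` are hypothetical names, the interleaving is scripted by hand so the outcome is deterministic, and a `Mutex` stands in for a hardware atomic read-modify-write.

```ruby
# Illustrative model only. SharedBuffer stands in for any heap object that
# is kept alive by a reference count and freed when the count reaches zero.
class SharedBuffer
  attr_reader :refcount

  def initialize
    @refcount = 1       # the creator holds the first reference
    @freed = false
    @lock = Mutex.new   # stand-in for an atomic read-modify-write
  end

  def freed?
    @freed
  end

  # The two halves of a non-atomic update, exposed separately so that an
  # unlucky interleaving can be written out step by step.
  def read_count
    @refcount
  end

  def write_count(value)
    @refcount = value
    @freed = true if value.zero?
  end

  # Atomic update: no other owner can slip in between the read and the write.
  def atomic_add(delta)
    @lock.synchronize { write_count(@refcount + delta) }
  end
end

# Owners A and B both take a reference, but with a split read/write.
def racy_scenario
  buf = SharedBuffer.new
  seen_by_a = buf.read_count       # A reads 1
  seen_by_b = buf.read_count       # B reads 1 before A has written
  buf.write_count(seen_by_a + 1)   # A writes 2
  buf.write_count(seen_by_b + 1)   # B writes 2; A's increment is lost
  buf.atomic_add(-1)               # creator releases: count 1
  buf.atomic_add(-1)               # A releases: count 0, buffer is freed
  buf                              # B still holds a reference to freed memory
end

# The same sequence of owners, with atomic increments.
def atomic_scenario
  buf = SharedBuffer.new
  buf.atomic_add(1)                # A: count 2
  buf.atomic_add(1)                # B: count 3
  buf.atomic_add(-1)               # creator releases: count 2
  buf.atomic_add(-1)               # A releases: count 1
  buf                              # B's reference keeps the buffer alive
end

puts racy_scenario.freed?    # true
puts atomic_scenario.freed?  # false
```

In the racy case one increment is lost, so the count reaches zero while an owner still exists. That is the general shape of a use-after-free caused by a non-atomic reference count; the exact fields and functions involved in CRuby are not shown here.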
Issue #22075 has been updated by jhawthorn (John Hawthorn).

Backport changed from 3.3: UNKNOWN, 3.4: UNKNOWN, 4.0: UNKNOWN to 3.3: DONTNEED, 3.4: DONTNEED, 4.0: REQUIRED

4.0 backport: https://github.com/ruby/ruby/pull/17055

----------------------------------------
Bug #22075: heap-use-after-free in `rb_vm_ci_lookup` under parallel Ractors
https://bugs.ruby-lang.org/issues/22075#change-117374

* Author: zonuexe (Kenta USAMI)
* Status: Closed
* Assignee: ractor
* ruby -v: 4.0.5
* Backport: 3.3: DONTNEED, 3.4: DONTNEED, 4.0: REQUIRED
----------------------------------------
Issue #22075 has been updated by k0kubun (Takashi Kokubun).

Backport changed from 3.3: DONTNEED, 3.4: DONTNEED, 4.0: REQUIRED to 3.3: DONTNEED, 3.4: DONTNEED, 4.0: DONE

ruby_4_0 commit:a08f356063abaa6c5defa1314c5c8bcd9cbb5c2b.

----------------------------------------
Bug #22075: heap-use-after-free in `rb_vm_ci_lookup` under parallel Ractors
https://bugs.ruby-lang.org/issues/22075#change-117386

* Author: zonuexe (Kenta USAMI)
* Status: Closed
* Assignee: ractor
* ruby -v: 4.0.5
* Backport: 3.3: DONTNEED, 3.4: DONTNEED, 4.0: DONE
----------------------------------------