Issue #19156 has been updated by mk (Matthias Käppler).
Thanks for sharing this; I still don't think this is related to the memory map going
away. Not exclusively anyway. My original test case does not close or unmap the counter
file, but it still crashes. The counter is still in scope when we walk the object space,
so it hasn't been GC'ed either. I also verified that `mm_unmap` is not called in
this case.
So there is something else at play where even during the life-time of the metric file i.e.
while still residing in memory, string memory gets corrupted. This also means that the
original fix proposed (keeping a reference to the memory mapped object on the string
itself) is not sufficient.
I will move this discussion to the library issue tracker; thanks for helping out!
----------------------------------------
Bug #19156: ObjectSpace.dump_all segfault during string inspection
https://bugs.ruby-lang.org/issues/19156#change-100438
* Author: mk (Matthias Käppler)
* Status: Third Party's Issue
* Priority: Normal
* ruby -v: ruby 3.0.4p208 (2022-04-12 revision 3fa771dded) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
I am working on a feature that would allow our application to capture heap dumps during
shutdown for later inspection.
These heap dumps are captured via `ObjectSpace.dump_all(output: io)`. While walking the
object space, MRI occasionally segfaults while inspecting string objects in
`search_nonascii` of `string.c`:
```
/usr/local/lib/ruby/3.0.0/objspace.rb:87: [BUG] Segmentation fault at 0x00007efee4201000
ruby 3.0.4p208 (2022-04-12 revision 3fa771dded) [x86_64-linux]
...
-- Control frame information -----------------------------------------------
c:0053 p:---- s:0312 e:000311 CFUNC :_dump_all
c:0052 p:0130 s:0305 e:000304 METHOD /usr/local/lib/ruby/3.0.0/objspace.rb:87
c:0051 p:0023 s:0295 e:000294 METHOD
/home/git/gitlab/lib/gitlab/memory/reports/heap_dump.rb:26
...
-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.3.0(rb_print_backtrace+0x11) [0x7efee4ad0c5e] vm_dump.c:758
/usr/local/lib/libruby.so.3.0(rb_vm_bugreport) vm_dump.c:998
/usr/local/lib/libruby.so.3.0(rb_bug_for_fatal_signal+0xf8) [0x7efee48d0b08] error.c:787
/usr/local/lib/libruby.so.3.0(sigsegv+0x55) [0x7efee4a23db5] signal.c:963
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7efee4f12140]
../sysdeps/pthread/funlockfile.c:28
/usr/local/lib/libruby.so.3.0(search_nonascii+0x30) [0x7efee4a3ca60] string.c:552
/usr/local/lib/libruby.so.3.0(coderange_scan) string.c:585
/usr/local/lib/libruby.so.3.0(enc_coderange_scan+0x1b) [0x7efee4a3e28a] string.c:709
/usr/local/lib/libruby.so.3.0(rb_enc_str_coderange) string.c:727
/usr/local/lib/ruby/3.0.0/x86_64-linux/objspace.so(is_broken_string+0x8) [0x7efeced9c304]
../../internal/string.h:116
/usr/local/lib/ruby/3.0.0/x86_64-linux/objspace.so(dump_object) objspace_dump.c:388
/usr/local/lib/ruby/3.0.0/x86_64-linux/objspace.so(heap_i+0x39) [0x7efeced9caa9]
objspace_dump.c:521
/usr/local/lib/libruby.so.3.0(objspace_each_objects_without_setup+0xaf) [0x7efee48e878f]
gc.c:3232
/usr/local/lib/libruby.so.3.0(objspace_each_objects_protected+0x14) [0x7efee48e87c4]
gc.c:3242
/usr/local/lib/libruby.so.3.0(rb_ensure+0x12a) [0x7efee48d96aa] eval.c:1162
/usr/local/lib/libruby.so.3.0(objspace_each_objects+0x28) [0x7efee48fb458] gc.c:3310
/usr/local/lib/libruby.so.3.0(rb_objspace_each_objects) gc.c:3298
/usr/local/lib/ruby/3.0.0/x86_64-linux/objspace.so(objspace_dump_all+0x88)
[0x7efeced9b068] objspace_dump.c:616
...
```
Unfortunately I couldn't get my hands on that memory region to see which strings are
causing this since this doesn't always happen.
I suspect this is also a problem with MRI master since the code looks unchanged from
3.0.4.
--
https://bugs.ruby-lang.org/