Issue #19156 has been updated by mk (Matthias Käppler).
byroot (Jean Boussier) wrote in #note-5:
> Also, based on the backtrace I believe that
> `ObjectSpace.each_object(String, &:valid_encoding?)` should cause the same crash.
Indeed!
```
bundle exec rbtrace -p $(pgrep -f 'worker 1') -e 'ObjectSpace.each_object(String, &:valid_encoding?)'
*** run `sudo sysctl kernel.msgmnb=1048576` to prevent losing events (currently: 16384 bytes)
*** attached to process 214
*** timed out waiting for eval response
*** could not detach cleanly from process 214
```
```
(eval):1: [BUG] Segmentation fault at 0x00007fea85bf5000
ruby 3.0.4p208 (2022-04-12 revision 3fa771dded) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0006 p:---- s:0025 e:000024 CFUNC :valid_encoding?
c:0005 p:---- s:0022 e:000021 CFUNC :each_object
c:0004 p:0023 s:0017 e:000016 EVAL (eval):1 [FINISH]
c:0003 p:---- s:0014 e:000013 CFUNC :eval
c:0002 p:0022 s:0009 e:000005 BLOCK eval:6 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
eval:6:in `block in eval_and_inspect'
eval:6:in `eval'
(eval):1:in `eval_context'
(eval):1:in `each_object'
(eval):1:in `valid_encoding?'
-- Machine register context ------------------------------------------------
RIP: 0x00007fea85a3ca60 RBP: 0x00007fea576f2720 RSP: 0x00007fea576f26e0
RAX: 0x8080808080808080 RBX: 0x00007fea84654000 RCX: 0x00007fea85bf5ff9
RDX: 0x8080808080808080 RDI: 0x00007fea84654000 RSI: 0x0000000000001000
R8: 0x0000000000000000 R9: 0x0000000000000000 R10: 0x0000000000000000
R11: 0x0000000000000000 R12: 0x00007fea85bf5000 R13: 0x0000000000001000
R14: 0x00007fea85bf6000 R15: 0x00007fea845d06d0 EFL: 0x0000000000010293
-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.3.0(rb_print_backtrace+0x11) [0x7fea85ad0c5e] vm_dump.c:758
/usr/local/lib/libruby.so.3.0(rb_vm_bugreport) vm_dump.c:998
/usr/local/lib/libruby.so.3.0(rb_bug_for_fatal_signal+0xf8) [0x7fea858d0b08] error.c:787
/usr/local/lib/libruby.so.3.0(sigsegv+0x55) [0x7fea85a23db5] signal.c:963
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7fea86041140] ../sysdeps/pthread/funlockfile.c:28
/usr/local/lib/libruby.so.3.0(search_nonascii+0x30) [0x7fea85a3ca60] string.c:552
/usr/local/lib/libruby.so.3.0(coderange_scan) string.c:585
/usr/local/lib/libruby.so.3.0(enc_coderange_scan+0x1b) [0x7fea85a3e28a] string.c:709
/usr/local/lib/libruby.so.3.0(rb_enc_str_coderange) string.c:727
/usr/local/lib/libruby.so.3.0(rb_str_valid_encoding_p+0x9) [0x7fea85a3ef99] string.c:10474
```
> The hard part now is to figure out where this string
> is allocated. `ObjectSpace.trace_object_allocations_start` can help here.
> ..
> Also, assuming my `str_new_static` theory is correct, you could look at:
> `grep -IR str_new_static $(bundle show --paths)` and see if anything stands out.
Thank you, these are good ideas, I will have a look.
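A minimal sketch of the allocation-tracing suggestion, assuming tracing can be switched on before the offending string is created (for example at boot). The `suspect` string and the `allocation_site` helper are hypothetical names used only for illustration; in the real process the object would have to be located some other way, such as through rbtrace.

```ruby
require "objspace"

# Record the source file and line of every object allocated from here on.
# This adds overhead, so it is usually enabled only while investigating.
ObjectSpace.trace_object_allocations_start

# Hypothetical stand-in for the code path suspected of creating the bad string.
suspect = String.new("example")

# Returns "file:line" for an object allocated while tracing was active,
# or nil if no allocation info was recorded for it.
def allocation_site(obj)
  file = ObjectSpace.allocation_sourcefile(obj)
  line = ObjectSpace.allocation_sourceline(obj)
  file ? "#{file}:#{line}" : nil
end

# Query before stopping so the recorded info is certainly still available.
site = allocation_site(suspect)
ObjectSpace.trace_object_allocations_stop

puts site
```

Objects allocated before tracing started have no recorded site, so the helper returns `nil` for them. While tracing is active, `ObjectSpace.dump_all` should also include `file` and `line` fields for traced objects, which may help narrow things down from a dump that completes.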
----------------------------------------
Bug #19156: ObjectSpace.dump_all segfault during string inspection
https://bugs.ruby-lang.org/issues/19156#change-100321
* Author: mk (Matthias Käppler)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.0.4p208 (2022-04-12 revision 3fa771dded) [x86_64-linux]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
I am working on a feature that would allow our application to capture heap dumps during
shutdown for later inspection.
These heap dumps are captured via `ObjectSpace.dump_all(output: io)`. While walking the
object space, MRI occasionally segfaults while inspecting string objects in
`search_nonascii` of `string.c`:
```
/usr/local/lib/ruby/3.0.0/objspace.rb:87: [BUG] Segmentation fault at 0x00007efee4201000
ruby 3.0.4p208 (2022-04-12 revision 3fa771dded) [x86_64-linux]
...
-- Control frame information -----------------------------------------------
c:0053 p:---- s:0312 e:000311 CFUNC :_dump_all
c:0052 p:0130 s:0305 e:000304 METHOD /usr/local/lib/ruby/3.0.0/objspace.rb:87
c:0051 p:0023 s:0295 e:000294 METHOD /home/git/gitlab/lib/gitlab/memory/reports/heap_dump.rb:26
...
-- C level backtrace information -------------------------------------------
/usr/local/lib/libruby.so.3.0(rb_print_backtrace+0x11) [0x7efee4ad0c5e] vm_dump.c:758
/usr/local/lib/libruby.so.3.0(rb_vm_bugreport) vm_dump.c:998
/usr/local/lib/libruby.so.3.0(rb_bug_for_fatal_signal+0xf8) [0x7efee48d0b08] error.c:787
/usr/local/lib/libruby.so.3.0(sigsegv+0x55) [0x7efee4a23db5] signal.c:963
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7efee4f12140] ../sysdeps/pthread/funlockfile.c:28
/usr/local/lib/libruby.so.3.0(search_nonascii+0x30) [0x7efee4a3ca60] string.c:552
/usr/local/lib/libruby.so.3.0(coderange_scan) string.c:585
/usr/local/lib/libruby.so.3.0(enc_coderange_scan+0x1b) [0x7efee4a3e28a] string.c:709
/usr/local/lib/libruby.so.3.0(rb_enc_str_coderange) string.c:727
/usr/local/lib/ruby/3.0.0/x86_64-linux/objspace.so(is_broken_string+0x8) [0x7efeced9c304] ../../internal/string.h:116
/usr/local/lib/ruby/3.0.0/x86_64-linux/objspace.so(dump_object) objspace_dump.c:388
/usr/local/lib/ruby/3.0.0/x86_64-linux/objspace.so(heap_i+0x39) [0x7efeced9caa9] objspace_dump.c:521
/usr/local/lib/libruby.so.3.0(objspace_each_objects_without_setup+0xaf) [0x7efee48e878f] gc.c:3232
/usr/local/lib/libruby.so.3.0(objspace_each_objects_protected+0x14) [0x7efee48e87c4] gc.c:3242
/usr/local/lib/libruby.so.3.0(rb_ensure+0x12a) [0x7efee48d96aa] eval.c:1162
/usr/local/lib/libruby.so.3.0(objspace_each_objects+0x28) [0x7efee48fb458] gc.c:3310
/usr/local/lib/libruby.so.3.0(rb_objspace_each_objects) gc.c:3298
/usr/local/lib/ruby/3.0.0/x86_64-linux/objspace.so(objspace_dump_all+0x88) [0x7efeced9b068] objspace_dump.c:616
...
```
Unfortunately, I couldn't inspect that memory region to see which strings trigger the
crash, since it does not happen on every dump.
I suspect this is also a problem with MRI master since the code looks unchanged from
3.0.4.
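For reference, the capture step described above can be sketched roughly as follows. This is not the application's actual code: the `capture_heap_dump` helper and the use of a temporary file are assumptions made for illustration; only `ObjectSpace.dump_all(output: io)` comes from the report.

```ruby
require "objspace"
require "tempfile"

# Walk the object space and write one JSON document per line to the given IO.
# This is the call that occasionally segfaults in the report above.
def capture_heap_dump(io)
  ObjectSpace.dump_all(output: io)
  io.flush
end

# Dump into a temporary file and count the entries that were written.
dump_lines = Tempfile.create("heap_dump") do |file|
  capture_heap_dump(file)
  file.rewind
  file.each_line.count
end

puts "dumped #{dump_lines} entries"
```

A healthy process should complete this without error; the crash only appears when the walk reaches a string whose buffer is no longer readable.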
--
https://bugs.ruby-lang.org/