
Issue #19606 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).

This is also causing test failures on my machine, because there are tests on the output of the bug reporter - e.g.

```
[17591/23407] TestBugReporter#test_bug_reporter_add = 0.15 s
  4) Failure:
TestBugReporter#test_bug_reporter_add [/home/kj/ruby/test/-ext-/bug_reporter/test_bug_reporter.rb:31]:
pid 169551 exit 1
| -:1: [BUG] Segmentation fault at 0x000003e80002964f
| ruby 3.3.0dev (2023-04-24T03:48:15Z master 886986b3ef) [x86_64-linux]
|
| -- Control frame information -----------------------------------------------
| c:0003 p:---- s:0012 e:000011 CFUNC :kill
| c:0002 p:0022 s:0006 e:000005 EVAL -:1 [FINISH]
| c:0001 p:0000 s:0003 E:001100 DUMMY [FINISH]
|
| -- Ruby level backtrace information ----------------------------------------
| -:1:in `<main>'
| -:1:in `kill'
|
| -- Threading information ---------------------------------------------------
| Total ractor count: 1
| Ruby thread count for this ractor: 1
|
| -- Machine register context ------------------------------------------------
| RIP: 0x00007f0793a6adab RBP: 0x000000000000000b RSP: 0x00007ffe339ef5b8
| RAX: 0x0000000000000000 RBX: 0x0000000000000001 RCX: 0x00007f0793a6adab
| RDX: 0x000000000002964f RDI: 0x000000000002964f RSI: 0x000000000000000b
| R8: 0x0000000000000000 R9: 0x0000000000000000 R10: 0x00007f0793a434e8
| R11: 0x0000000000000206 R12: 0x0000000000000002 R13: 0x00007f0793927048
| R14: 0x000000000002964f R15: 0x0000000000000001 EFL: 0x0000000000000206
|
| -- C level backtrace information -------------------------------------------
| 1344: Abbrev Number 27547 not found
..
1. [2/2] Assertion for "stderr"
| Expected /Sample bug reporter: 12345/
| to match
| "-- Control frame information -----------------------------------------------\n"+
| "c:0003 p:---- s:0012 e:000011 CFUNC :kill\n"+
| "c:0002 p:0022 s:0006 e:000005 EVAL -:1 [FINISH]\n"+
| "c:0001 p:0000 s:0003 E:001100 DUMMY [FINISH]\n\n"+
| "-- Ruby level backtrace information ----------------------------------------\n"+
| "-:1:in `<main>'\n"+
| "-:1:in `kill'\n\n"+
| "-- Threading information ---------------------------------------------------\n"+
| "Total ractor count: 1\n"+
| "Ruby thread count for this ractor: 1\n\n"+
| "-- Machine register context ------------------------------------------------\n"+
| " RIP: 0x00007f0793a6adab RBP: 0x000000000000000b RSP: 0x00007ffe339ef5b8\n"+
| " RAX: 0x0000000000000000 RBX: 0x0000000000000001 RCX: 0x00007f0793a6adab\n"+
| " RDX: 0x000000000002964f RDI: 0x000000000002964f RSI: 0x000000000000000b\n"+
| " R8: 0x0000000000000000 R9: 0x0000000000000000 R10: 0x00007f0793a434e8\n"+
| " R11: 0x0000000000000206 R12: 0x0000000000000002 R13: 0x00007f0793927048\n"+
| " R14: 0x000000000002964f R15: 0x0000000000000001 EFL: 0x0000000000000206\n\n"+
| "-- C level backtrace information -------------------------------------------\n"+
| "1344: Abbrev Number 27547 not found\n"
| after 4 patterns with 123 characters.
```

----------------------------------------
Bug #19606: addr2line.c broken on Fedora 38
https://bugs.ruby-lang.org/issues/19606#change-102907

* Author: kjtsanaktsidis (KJ Tsanaktsidis)
* Status: Open
* Priority: Normal
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
I'm running the Fedora 38 beta on my machine, and the Ruby crash reporter is itself crashing while trying to print C-level backtraces, with an error like this:

```
-- C level backtrace information -------------------------------------------
1344: Abbrev Number 27547 not found
```

This seems to happen because the debuginfo provided by Fedora 38 (e.g. for the libc frames) is compressed with dwz. Some debuginfo is shared between multiple files, and a reference to the shared file is put in the `.gnu_debugaltlink` section of the debug object. The main debug object then contains attributes with the forms `DW_FORM_GNU_ref_alt` and `DW_FORM_GNU_strp_alt`, whose values are offsets into the debugaltlink file.

For example, the libc debug symbols contain a `.gnu_debugaltlink` section:

```
% readelf -x .gnu_debugaltlink /usr/lib/debug/lib64/libc.so.6-2.37-1.fc38.x86_64.debug

Hex dump of section '.gnu_debugaltlink':
  0x00000000 2e2e2f2e 64777a2f 676c6962 632d322e ../.dwz/glibc-2.
  0x00000010 33372d31 2e666333 382e7838 365f3634 37-1.fc38.x86_64
  0x00000020 002a30a8 c71ff2fd aa58a637 226c0f56 .*0......X.7"l.V
  0x00000030 3ffd674a de                         ?.gJ.
```

Some DWARF abbrevs contain references to this shared data:

```
% readelf --debug-dump /usr/lib/debug/lib64/libc.so.6-2.37-1.fc38.x86_64.debug
...
Contents of the .debug_abbrev section (loaded from /usr/lib/debug/lib64/libc.so.6-2.37-1.fc38.x86_64.debug):
...
   50      DW_TAG_typedef    [no children]
    DW_AT_name         DW_FORM_GNU_strp_alt
    DW_AT_decl_file    DW_FORM_data1
    DW_AT_decl_line    DW_FORM_data1
    DW_AT_decl_column  DW_FORM_data1
    DW_AT_type         DW_FORM_GNU_ref_alt
    DW_AT value: 0     DW_FORM value: 0
...
```

Because addr2line.c doesn't know how to read these `DW_FORM_GNU_*` forms in `debug_info_reader_read_value`, it doesn't advance the reader at all, so the DWARF parsing gets out of sync. This can lead to a garbage abbrev number being read, or even to outright segfaults.

I have a patch which simply skips over the right number of bytes for these forms, without actually reading any of the data: https://github.com/ruby/ruby/pull/7731. This was sufficient to make the crash reporter work properly again on my machine.

_Technically_ speaking, perhaps we should _actually_ open the `.gnu_debugaltlink` file and dig out the referenced attribute value. This would be required if filename strings were put into the shared dwz info with `DW_FORM_GNU_strp_alt`. If you think this is necessary, I can try to put together a follow-up patch to do this. However, I've not yet seen a debuginfo file that does this.

-- 
https://bugs.ruby-lang.org/
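
To illustrate the skipping approach described in the report, here is a minimal sketch. It is not the code from the pull request: `struct toy_reader` and `toy_skip_alt_form` are hypothetical names made up for this example, and the real reader in addr2line.c is structured differently. It relies only on the fact that both GNU alt forms hold a section offset, which is 4 bytes wide in 32-bit DWARF and 8 bytes wide in 64-bit DWARF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GNU extension form codes emitted by dwz. */
#define DW_FORM_GNU_ref_alt  0x1f20
#define DW_FORM_GNU_strp_alt 0x1f21

/* Hypothetical, simplified reader; NOT the struct used in addr2line.c. */
struct toy_reader {
    const uint8_t *p;    /* current read position */
    const uint8_t *end;  /* one past the last readable byte */
    int is_64bit;        /* nonzero for the 64-bit DWARF format */
};

/*
 * Skip the value of one attribute if its form is a dwz "alt" form.
 * Returns the number of bytes skipped, 0 if the form is not handled
 * here, or -1 if the value would run past the end of the buffer.
 */
static int
toy_skip_alt_form(struct toy_reader *r, uint64_t form)
{
    size_t width;

    assert(r != NULL && r->p <= r->end);
    switch (form) {
      case DW_FORM_GNU_ref_alt:   /* offset into the alt file's .debug_info */
      case DW_FORM_GNU_strp_alt:  /* offset into the alt file's .debug_str */
        width = r->is_64bit ? 8 : 4;
        break;
      default:
        return 0;
    }
    if ((size_t)(r->end - r->p) < width) return -1;
    r->p += width;  /* advance without interpreting the offset */
    return (int)width;
}
```

The point of the sketch is only that the reader position must move past the value even when the value itself is ignored; otherwise every subsequent attribute is decoded from the wrong offset.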