[ruby-core:114319] [Ruby master Feature#19790] Optionally write Ruby crash reports into a file rather than STDERR

Issue #19790 has been reported by byroot (Jean Boussier).

----------------------------------------
Feature #19790: Optionally write Ruby crash reports into a file rather than STDERR
https://bugs.ruby-lang.org/issues/19790

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Use case

On our servers we set [`/proc/sys/kernel/core_pattern`](https://man7.org/linux/man-pages/man5/core.5.html) to point to a small utility that reports all the crashes happening in production, with the associated core dump, into BugSnag.

This allowed us to find and fix many Ruby and native extension bugs in the last few years.

However, these are hard to triage, so we'd like to augment these crash reports with the output of `rb_vm_bugreport()`.

### Problem

`rb_vm_bugreport()` is hard coded to print to STDERR. This makes it hard to extract and parse the report in a production environment, as very often STDERR is shared with other processes, so the crash report is intertwined with logs from other processes.

### Feature Request

It would be very useful if Ruby could write the crash report to an arbitrary path rather than STDERR, akin to `kernel/core_pattern`.

It would be especially useful if it supported interpolating the crashing process PID with `%p` like `kernel/core_pattern`, as it would make it easier to map that report to the core file.

This could be controlled by an environment variable such as `RUBY_BUGREPORT_PATH`, e.g.

```
RUBY_BUGREPORT_PATH=/var/log/ruby/ruby-crash-pid-%p.log
```

### Optional Features

`kernel/core_pattern` supports other interpolations; however, not all of them would make sense for Ruby to support.

* `%%`: A single % character.
* `%c`: Core file size soft resource limit of crashing process (since Linux 2.6.24).
* `%d`: Dump mode, same as value returned by prctl(2) PR_GET_DUMPABLE (since Linux 3.7).
* `%e`: The process or thread's comm value, which typically is the same as the executable filename (without path prefix, and truncated to a maximum of 15 characters), but may have been modified to be something different; see the discussion of /proc/pid/comm and /proc/pid/task/tid/comm in proc(5).
* `%E`: Pathname of executable, with slashes ('/') replaced by exclamation marks ('!') (since Linux 3.0).
* `%g`: Numeric real GID of dumped process.
* `%h`: Hostname (same as nodename returned by uname(2)).
* `%i`: TID of thread that triggered core dump, as seen in the PID namespace in which the thread resides (since Linux 3.18).
* `%I`: TID of thread that triggered core dump, as seen in the initial PID namespace (since Linux 3.18).
* `%p`: PID of dumped process, as seen in the PID namespace in which the process resides.
* `%P`: PID of dumped process, as seen in the initial PID namespace (since Linux 3.12).
* `%s`: Number of signal causing dump.
* `%t`: Time of dump, expressed as seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC).
* `%u`: Numeric real UID of dumped process.

Additionally, if `kernel/core_pattern` starts with a pipe (`|`), the core dump is piped to another program. This may also make sense as a feature.

### Prior Art

Aside from `kernel/core_pattern`, some other virtual machines have a similar feature. For instance, the JVM has a configurable crash report:

```
-XX:ErrorFile=/var/log/java/hs_err_pid%p.log
```

--
https://bugs.ruby-lang.org/
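
To make the requested `%p` interpolation concrete, here is a minimal sketch of how a `core_pattern`-style template could be expanded. It is not Ruby's implementation: the function name `expand_report_path`, its signature, and the choice to support only `%%`, `%p` and `%t` are assumptions made for illustration. It also uses `snprintf`, which is not async-signal-safe, so it only illustrates the interpolation rules, not what would be safe inside a crash handler.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of Ruby): expand a core_pattern-style
 * template into `out`.
 *   %%  literal percent sign
 *   %p  process ID (passed in as `pid`)
 *   %t  seconds since the Epoch (passed in as `epoch`)
 * Any other %x sequence is copied through unchanged.
 * Returns the expanded length, or -1 if `out` is too small.
 */
int
expand_report_path(const char *pattern, long pid, long epoch,
                   char *out, size_t outlen)
{
    size_t n = 0;

    if (outlen == 0) return -1;

    for (const char *p = pattern; *p; p++) {
        char num[32];
        const char *piece = p;   /* default: copy this one character */
        size_t len = 1;

        if (*p == '%' && p[1] == '%') {
            piece = "%";
            p++;                 /* consume the second '%' */
        }
        else if (*p == '%' && (p[1] == 'p' || p[1] == 't')) {
            /* NOTE: snprintf is not async-signal-safe */
            len = (size_t)snprintf(num, sizeof(num), "%ld",
                                   p[1] == 'p' ? pid : epoch);
            piece = num;
            p++;                 /* consume the specifier letter */
        }

        if (n + len >= outlen) return -1;   /* keep room for the NUL */
        for (size_t i = 0; i < len; i++) out[n++] = piece[i];
    }
    out[n] = '\0';
    return (int)n;
}
```

With the example value from the request, `expand_report_path("/var/log/ruby/ruby-crash-pid-%p.log", 1234, 0, buf, sizeof(buf))` would leave `/var/log/ruby/ruby-crash-pid-1234.log` in `buf`.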

Issue #19790 has been updated by nobu (Nobuyoshi Nakada).

https://github.com/nobu/ruby/tree/bugreport_path

Implemented `%%`, `%e`, `%E`, `%f`, `%F`, `%p`, `%t` and `%NNN`(octal).

When invoking pipe, the string is split with white spaces first without quoting, `%040` would be useful to represent a space character, especially on Windows.

https://bugs.ruby-lang.org/issues/19790#change-104084
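
The interaction between whitespace splitting and the `%NNN` escape described above can be sketched as follows. This is an illustration only, not the code in nobu's branch: the function names, the rule that an escape is exactly three octal digits with a first digit of 0 to 3, and the choice to split on spaces and tabs are all assumptions.

```c
#include <stddef.h>

static int is_octal(char c) { return c >= '0' && c <= '7'; }

/*
 * Hypothetical helper: replace each %NNN (three octal digits, value
 * at most 0377) in `s` with the byte it names, in place.
 */
void
decode_octal_escapes(char *s)
{
    char *w = s;
    for (const char *r = s; *r; ) {
        if (r[0] == '%' && r[1] >= '0' && r[1] <= '3' &&
            is_octal(r[2]) && is_octal(r[3])) {
            *w++ = (char)(((r[1] - '0') << 6) | ((r[2] - '0') << 3) | (r[3] - '0'));
            r += 4;
        }
        else {
            *w++ = *r++;
        }
    }
    *w = '\0';
}

/*
 * Hypothetical helper: split `cmd` on spaces and tabs (no quoting),
 * then decode escapes inside each token. Because splitting happens
 * first, a %040 inside a token becomes a space without splitting it.
 * Modifies `cmd` in place and returns the number of tokens stored.
 */
int
split_pipe_command(char *cmd, char **argv, int max_args)
{
    int argc = 0;
    char *p = cmd;

    while (*p && argc < max_args) {
        while (*p == ' ' || *p == '\t') p++;       /* skip separators */
        if (!*p) break;
        argv[argc++] = p;                          /* token starts here */
        while (*p && *p != ' ' && *p != '\t') p++;
        if (*p) *p++ = '\0';                       /* terminate token */
    }
    for (int i = 0; i < argc; i++) decode_octal_escapes(argv[i]);
    return argc;
}
```

Under these assumptions, the command `reporter --dir=/tmp/crash%040logs` yields two arguments, the second being `--dir=/tmp/crash logs`.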

Issue #19790 has been updated by byroot (Jean Boussier).

Thank you @nobu, I wasn't expecting someone to implement it :)

I take it that you are supportive of the feature?

https://bugs.ruby-lang.org/issues/19790#change-104093

Issue #19790 has been updated by nobu (Nobuyoshi Nakada).

byroot (Jean Boussier) wrote in #note-2:
> Thank you @nobu, I wasn't expecting someone to implement it :)
> I take it that you are supportive of the feature?

I'm positive, but also afraid if this can work well in a signal handler.

https://bugs.ruby-lang.org/issues/19790#change-104094

Issue #19790 has been updated by nobu (Nobuyoshi Nakada).

Note that the branch includes a testing feature, and is not ready to merge as-is.

https://bugs.ruby-lang.org/issues/19790#change-104095

Issue #19790 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).

> afraid if this can work well in a signal handler.

The current bugreporter does lots of things which are not _really_ safe to do in a signal handler... even `fprintf` is not async-signal-safe (at least on Linux, according to signal-safety(7)). It also accesses the Ruby stack, which of course may be in an inconsistent state. I guess it all works "well enough" most of the time.

Looking at your patch, it _looks_ about as signal-safe as the current implementation is. Notably the `sprintf` calls are unsafe (but we're already doing that), and I'm not sure if the `RSTRING_PTR` etc. manipulations are safe (but we do things like that in the Ruby backtrace-printing code anyway). Nothing _else_ jumps out to my eyes as notably more hazardous.

The only thing I might suggest is to push back the opening of the bugreport file (and the parsing of the `%`'s) until after the `if (crashing)` check in `rb_vm_bugreport`, so that these unsafe operations can't cause recursive crashes.

(Also, unrelated to the issue at hand, but I think the `crashing = true` assignment might strictly speaking need a memory fence after it.)

---

If we wanted Ruby crashdumps to be signal-safe, I guess the _best_ option would be to simply ask the operating system to generate coredumps on crashes, and write some external tooling to do nice things like extract Ruby backtraces out of those coredumps. However, that code would be quite platform-specific, and would also duplicate decent amounts of Ruby's string/backtrace handling code, so I can see why this hasn't been the approach so far.
https://bugs.ruby-lang.org/issues/19790#change-104096
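
The ordering suggested above (check the re-entrancy guard first, only then open the report file) could look roughly like the sketch below. It is not the code of `rb_vm_bugreport`: the function `begin_crash_report`, the file-scope `crashing` flag, and the open flags are assumptions. It uses a C11 `atomic_flag`, whose test-and-set is lock-free and sequentially consistent by default, as one possible way to address the memory-fence remark.

```c
#include <fcntl.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical guard standing in for the `crashing` variable. */
static atomic_flag crashing = ATOMIC_FLAG_INIT;

/*
 * Hypothetical entry point for a crash reporter.
 * Returns a descriptor to write the report to, or -1 if a report is
 * already in progress (for example, the reporter itself crashed).
 */
int
begin_crash_report(const char *path)
{
    /*
     * Step 1: take the guard before anything risky. test-and-set is a
     * single atomic read-modify-write, so no separate fence is needed.
     */
    if (atomic_flag_test_and_set(&crashing)) return -1;

    /*
     * Step 2: only now open the destination. If this step crashes,
     * the re-entered handler sees the flag and bails out in step 1.
     */
    int fd = -1;
    if (path != NULL) {
        fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    }
    return fd >= 0 ? fd : STDERR_FILENO;   /* fall back to stderr */
}
```

Calling `begin_crash_report(NULL)` twice returns `STDERR_FILENO` the first time and `-1` the second, which is the recursion cut-off the suggestion is after.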

Issue #19790 has been updated by byroot (Jean Boussier).

> also afraid if this can work well in a signal handler.

Yeah, as @kjtsanaktsidis said, in theory the current one already uses things that aren't async-signal-safe, so I don't think it really changes much here.

`open(2)` should be async-signal-safe according to https://man7.org/linux/man-pages/man7/signal-safety.7.html, so unless I'm missing something that's the only real addition here.

https://bugs.ruby-lang.org/issues/19790#change-104097
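
For comparison with the `sprintf`/`fprintf` calls discussed above, this is a minimal sketch of what a strictly async-signal-safe path could look like: the number is formatted by hand and emitted with `write(2)`, which is on the signal-safety(7) list alongside `open(2)`. The helper names `format_decimal` and `write_all` are hypothetical and do not exist in Ruby.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: format `value` as decimal into `buf` without
 * stdio. Returns the number of digits written (excluding the NUL),
 * or 0 if `buf` is too small.
 */
size_t
format_decimal(unsigned long value, char *buf, size_t buflen)
{
    char tmp[3 * sizeof(unsigned long)];   /* enough digits for any value */
    size_t n = 0;

    do {                                   /* digits, least significant first */
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (n + 1 > buflen) return 0;
    for (size_t i = 0; i < n; i++) buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}

/*
 * Hypothetical helper: write the whole buffer with write(2), retrying
 * on short writes and EINTR. Returns 0 on success, -1 on error.
 */
int
write_all(int fd, const char *data, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, data, len);
        if (w < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        data += w;
        len -= (size_t)w;
    }
    return 0;
}
```

A handler could combine the two, for example formatting the result of `getpid()` with `format_decimal` and passing the buffer to `write_all`. A real handler would also want to save and restore `errno` around these calls.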

Issue #19790 has been updated by ko1 (Koichi Sasada).

Does it print to both STDERR and a file? or only to a file?

https://bugs.ruby-lang.org/issues/19790#change-104190

Issue #19790 has been updated by byroot (Jean Boussier).

I would say only a file if the environment variable is set, with perhaps just a message on stderr saying the report was written into that path.

But I don't have a strong opinion about this, if others think it should be written in both, that's workable for me, I just personally don't think it has value.

https://bugs.ruby-lang.org/issues/19790#change-104191
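
The behaviour proposed in this reply (report goes only to the file, with a one-line pointer on stderr) might be sketched like this. The function `open_report_destination`, the wording of the notice, and the fallback to stderr when the file cannot be opened are assumptions for illustration, not decisions recorded in this thread.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Best-effort write to stderr; errors are deliberately ignored. */
static void
notice(const char *s)
{
    ssize_t ignored = write(STDERR_FILENO, s, strlen(s));
    (void)ignored;
}

/*
 * Hypothetical helper: choose where the crash report goes.
 * - no path configured: stderr, as today
 * - path configured and openable: the file, plus a short note on stderr
 * - path configured but not openable: stderr, so the report is not lost
 */
int
open_report_destination(const char *path)
{
    if (path == NULL || *path == '\0') return STDERR_FILENO;

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0) return STDERR_FILENO;

    notice("Crash report written to: ");
    notice(path);
    notice("\n");
    return fd;
}
```

Here `path` would be the already-expanded value of an environment variable such as `RUBY_BUGREPORT_PATH`.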

Issue #19790 has been updated by matz (Yukihiro Matsumoto).

I accept this proposal, at least we can experiment. I am a bit worried about calling open(2) in other functions/system calls from the signal handler.

Matz.

----------------------------------------
Feature #19790: Optionally write Ruby crash reports into a file rather than STDERR
https://bugs.ruby-lang.org/issues/19790#change-104268

Issue #19790 has been updated by byroot (Jean Boussier).
> I accept this proposal
Thank you!
> I am a bit worried about calling open(2) in other functions/system calls from the signal handler.
Unless I'm misreading it, POSIX says `open(2)` should be async-signal-safe (https://man7.org/linux/man-pages/man7/signal-safety.7.html), so we should be good.

Either way, I think the crash reporter is understood as being "best effort". In rare cases, generating the report can already cause a recursive crash; as long as it works in the vast majority of cases, it's not the end of the world. At this stage all bets are off anyway.

@nobu do you wish to finish your branch? Let me know if I can help in any way.

----------------------------------------
Feature #19790: Optionally write Ruby crash reports into a file rather than STDERR
https://bugs.ruby-lang.org/issues/19790#change-104269

Issue #19790 has been updated by byroot (Jean Boussier).

A note from the dev meeting logs: the env var that was accepted is `RUBY_CRASH_REPORT` (not `RUBY_BUGREPORT_PATH`).

----------------------------------------
Feature #19790: Optionally write Ruby crash reports into a file rather than STDERR
https://bugs.ruby-lang.org/issues/19790#change-104408
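To make the `%p` interpolation from the original proposal concrete, here is a small Ruby sketch of how a `core_pattern`-style template could be expanded. The helper name `expand_crash_report_path` is hypothetical, and handling only `%p` and `%%` is an assumption of this example; it does not claim to match the set of specifiers that the accepted `RUBY_CRASH_REPORT` variable ends up supporting.

```ruby
# Hypothetical helper for illustration; not part of Ruby itself.
# Expands a core_pattern-style template:
#   %p -> the given process ID
#   %% -> a single literal percent sign
# Any other %-sequence is left untouched in this sketch.
def expand_crash_report_path(template, pid = Process.pid)
  template.gsub(/%[%p]/) do |token|
    token == "%p" ? pid.to_s : "%"
  end
end

# Template taken from the example in the feature request.
puts expand_crash_report_path("/var/log/ruby/ruby-crash-pid-%p.log", 4242)
# prints "/var/log/ruby/ruby-crash-pid-4242.log"
```

Matching `%%` and `%p` in a single left-to-right pass means an escaped `%%p` yields the literal text `%p` instead of being expanded a second time.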

Issue #19790 has been updated by byroot (Jean Boussier).

I've backported @nobu's branch to our 3.2 rubies and it's been working very well. I'm not too sure what is still missing.

----------------------------------------
Feature #19790: Optionally write Ruby crash reports into a file rather than STDERR
https://bugs.ruby-lang.org/issues/19790#change-104532
participants (5)
- byroot (Jean Boussier)
- kjtsanaktsidis (KJ Tsanaktsidis)
- ko1 (Koichi Sasada)
- matz (Yukihiro Matsumoto)
- nobu (Nobuyoshi Nakada)