[ruby-core:116134] [Ruby master Bug#20174] Ruby 3.2 jit_cont_free segfault with YJIT enabled

Issue #20174 has been reported by ziggythehamster (Keith Gable).

----------------------------------------
Bug #20174: Ruby 3.2 jit_cont_free segfault with YJIT enabled
https://bugs.ruby-lang.org/issues/20174

* Author: ziggythehamster (Keith Gable)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) +YJIT [x86_64-linux]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN
----------------------------------------
Ruby 3.2 segfaults reproducibly for us on aarch64 (Graviton2) and x86_64 with YJIT enabled ... however all of my attempts to make a minimal reproducible test case have failed. The control frame of the segfault isn't consistent, but the C backtrace is. It doesn't occur in 3.3, so I used the backtrace to find the function (jit_cont_free) and compare it between 3.2 and 3.3. The only change that seemed like a plausible candidate was a one-line guard added by k0kubun in e07e9f8491d9ab8b22d2bdf6a8aeba834dac7eef, so I added .patch to the end of the URL on GitHub and added that as a patch against 3.2. This resolved the problem. I would therefore suggest backporting that change from 3.3 to 3.2 :).

The change that triggered this is a two line change to a gemspec (that we need to refactor) that we made because Rake released a new version, which I will censor and annotate below:

```
# frozen_string_literal: true

$LOAD_PATH.push File.expand_path('lib', __dir__)

require 'rubygems/dependency_installer'
# before: Gem::DependencyInstaller.new.install(Gem::Dependency.new('rake'))
# after:
Gem::DependencyInstaller.new.install(Gem::Dependency.new('rake', '~> 13.1.0'))
require 'rake/file_list'

Gem::Specification.new do |spec|
  spec.name = 'censored'
  spec.version = '0.0.1.pre'
  spec.author = 'censored'
  spec.email = 'censored'
  spec.summary = 'censored'
  spec.description = 'censored'
  spec.homepage = 'censored'
  spec.license = 'All rights reserved'
  spec.required_ruby_version = '>= 2.6.0'
  spec.metadata['homepage_uri'] = spec.homepage

  gitignore = File.read('.gitignore').lines.reject { |l| l.match?(/\A\s*#/) || l.match?(/\A\s*\z/) }.map(&:chomp)
  spec.files = Rake::FileList.new('\.[a-zA-Z0-9]*', '\.[a-zA-Z0-9]*/*', '**/*')
                             .exclude(gitignore)
                             .reject { |f| File.directory?(f) || f.match(%r{\A(test|spec|features|vendor|.git|.bundle)/}) }
  to_include = gitignore.select { |l| l.match?(/\A\s*!/) }.map { |l| l.delete_prefix('!') }
  spec.files += Rake::FileList.new(to_include).reject { |f| File.directory?(f) }
  spec.require_paths = ['lib']

  spec.add_development_dependency 'awesome_print', '~> 1.8.0'
  spec.add_development_dependency 'pry', '~> 0.14.2'
  # before: spec.add_development_dependency 'rake', '~> 13.0.1'
  # after:
  spec.add_development_dependency 'rake', '~> 13.1.0'
  spec.add_development_dependency 'rdoc', '~> 6.3.1'
  spec.add_development_dependency 'rspec', '~> 3.11.0'
  spec.add_development_dependency 'rspec_junit_formatter', '~> 0.5.1'
  spec.add_development_dependency 'rubocop', '~> 1.39.0'
  spec.add_development_dependency 'rubocop-packaging', '~> 0.5.2'
  spec.add_development_dependency 'rubocop-rake', '~> 0.6.0'
  spec.add_development_dependency 'rubocop-rspec', '~> 2.12.1'
  spec.add_development_dependency 'simplecov', '~> 0.21.2'
  spec.add_development_dependency 'simplecov-cobertura', '~> 2.1.0'
  spec.add_development_dependency 'yard', '~> 0.9.25'

  spec.add_runtime_dependency 'activesupport', '>= 5.1.7', '< 8'
  spec.add_runtime_dependency 'censored-m', '~> 0.1.72'
  spec.add_runtime_dependency 'censored-r', '~> 0.1.175'
  spec.add_runtime_dependency 'aws-sdk-athena', '~> 1.43'
  spec.add_runtime_dependency 'aws-sdk-cloudwatch', '~> 1.5'
  spec.add_runtime_dependency 'aws-sdk-core', '~> 3.122'
  spec.add_runtime_dependency 'aws-sdk-dynamodb', '~> 1.5'
  spec.add_runtime_dependency 'aws-sdk-firehose', '~> 1.1'
  spec.add_runtime_dependency 'aws-sdk-glue', '~> 1.108'
  spec.add_runtime_dependency 'aws-sdk-kinesis', '~> 1.13'
  spec.add_runtime_dependency 'aws-sdk-redshift', '~> 1.2'
  spec.add_runtime_dependency 'aws-sdk-s3', '~> 1.9'
  spec.add_runtime_dependency 'aws-sdk-sns', '~> 1.3'
  spec.add_runtime_dependency 'aws-sdk-sqs', '~> 1.3'
  spec.add_runtime_dependency 'aws-sdk-ssm', '~> 1.76'
  spec.add_runtime_dependency 'concurrent-ruby', '>= 1.1.5'
  spec.add_runtime_dependency 'dry-configurable', '~> 0.13'
end
```

However, I cannot turn this gemspec into a reproducer on my end. The smallest change makes the segfault go away.

To avoid a gigantic issue description, I have attached the censored segfault backtraces and the RbConfig from the x86_64 build (since I'm compiling my own Ruby). I'll note again that while aarch64 and x86_64 appear to have failed while doing something in optparse this time, it appears random (or more likely: GC pressure related). I've also seen it fail when doing a require_relative much earlier. It always fails with the same C backtrace.

I have absolutely no idea why `cont` might be NULL. The backtrace shows it is called by `cont_free`, which has a `VM_ASSERT` for detecting this condition. Obviously, there must be some situation where jit_cont becomes NULL due to YJIT, but I have no idea what that situation is.

In case someone thinks this might be compiler/compiler option related, I am using the following:

* Amazon Linux 2
* LLVM/Clang 11 except that because Amazon Linux doesn't ship lld, we are using `gcc10-ld.gold` as the linker
* rustc 1.68.2 (9eb3afe9e 2023-03-27) (Amazon Linux 1.68.2-1.amzn2.0.3)
* OpenSSL 3.0.12 (self-compiled with corp-dictated hardened configuration options ... I can share if someone thinks this is relevant)
* `extra_warnflags="-Wno-address-of-packed-member -Wno-declaration-after-statement -Wno-register"`
* aarch64: `optflags="-O3 -mcpu=neoverse-n1"`
* x86_64: `optflags="-O3 -march=sandybridge"`

---Files--------------------------------
bt_aarch64.txt (34.5 KB)
bt_x86_64.txt (30.3 KB)
rbconfig_x86_64.txt (9.36 KB)

--
https://bugs.ruby-lang.org/
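The report depends on YJIT being active in the crashing process (the `+YJIT` tag in the `ruby -v` line above). A minimal sketch for confirming that from inside a running program is below. `yjit_status` is a hypothetical helper name introduced only for illustration; `RUBY_DESCRIPTION` and `RubyVM::YJIT.enabled?` are existing CRuby constants and methods, and the sketch assumes nothing else about the reporter's setup.

```ruby
# Sketch: report whether YJIT is active in the current process.
# `yjit_status` is a hypothetical helper, not part of Ruby itself.
def yjit_status
  # RubyVM::YJIT is not defined on interpreters without YJIT support,
  # so check for the constant before calling into it.
  return :unavailable unless defined?(RubyVM::YJIT)

  RubyVM::YJIT.enabled? ? :enabled : :disabled
end

puts RUBY_DESCRIPTION        # the same text that `ruby -v` prints
puts "YJIT: #{yjit_status}"
```

Including this output alongside the backtraces makes it easier to tell YJIT-enabled runs apart from interpreter-only runs when comparing crash reports.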

Issue #20174 has been updated by byroot (Jean Boussier).

Status changed from Open to Closed
Backport changed from 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN to 3.0: DONTNEED, 3.1: DONTNEED, 3.2: REQUIRED, 3.3: DONTNEED

Thanks for the report. Editing the issue to mark this commit for backport.

Commit to backport: `e07e9f8491d9ab8b22d2bdf6a8aeba834dac7eef`

----------------------------------------
Bug #20174: Ruby 3.2 jit_cont_free segfault with YJIT enabled
https://bugs.ruby-lang.org/issues/20174#change-106142
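For readers who want to try the fix on a self-built 3.2 before a release carries it, the reporter's approach was to append `.patch` to the commit's GitHub URL. A small sketch of building that URL follows. `commit_patch_url` is a hypothetical helper written for this note, and the default repository `ruby/ruby` is an assumption about where the commit is being viewed.

```ruby
# Hypothetical helper: build the ".patch" URL for a commit on GitHub.
# Assumes the commit is viewed in the github.com/ruby/ruby repository.
def commit_patch_url(sha, repo: 'ruby/ruby')
  "https://github.com/#{repo}/commit/#{sha}.patch"
end

# The commit marked for backport in this thread.
puts commit_patch_url('e07e9f8491d9ab8b22d2bdf6a8aeba834dac7eef')
```

The downloaded file can then be applied to the 3.2 source tree as part of a custom build, which is what the reporter describes doing.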

Issue #20174 has been updated by ziggythehamster (Keith Gable).

byroot (Jean Boussier) wrote in #note-1:
> Thanks for the report. Editing the issue to mark this commit for backport.
> Commit to backport: `e07e9f8491d9ab8b22d2bdf6a8aeba834dac7eef`

This being my first bug - did you mean to make it status Closed?

----------------------------------------
Bug #20174: Ruby 3.2 jit_cont_free segfault with YJIT enabled
https://bugs.ruby-lang.org/issues/20174#change-106161

Issue #20174 has been updated by byroot (Jean Boussier).

Yes, it's how you mark a commit for backport (closed ticket with the backport field filled)

----------------------------------------
Bug #20174: Ruby 3.2 jit_cont_free segfault with YJIT enabled
https://bugs.ruby-lang.org/issues/20174#change-106162

Issue #20174 has been updated by nagachika (Tomoyuki Chikanaga).

Backport changed from 3.0: DONTNEED, 3.1: DONTNEED, 3.2: REQUIRED, 3.3: DONTNEED to 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONE, 3.3: DONTNEED

ruby_3_2 3302e251dccec1e981945ab19d316d0856c68bf6 merged revision(s) e07e9f8491d9ab8b22d2bdf6a8aeba834dac7eef.

----------------------------------------
Bug #20174: Ruby 3.2 jit_cont_free segfault with YJIT enabled
https://bugs.ruby-lang.org/issues/20174#change-106307

participants (3)
- byroot (Jean Boussier)
- nagachika (Tomoyuki Chikanaga)
- ziggythehamster (Keith Gable)