[ruby-core:113782] [Ruby master Bug#19716] SystemStackError occurs too easily on Alpine Linux (due to small stack size reported by pthread_attr_getstacksize on musl libc)

Issue #19716 has been reported by alexdowad (Alex Dowad).

----------------------------------------
Bug #19716: SystemStackError occurs too easily on Alpine Linux (due to small stack size reported by pthread_attr_getstacksize on musl libc)
https://bugs.ruby-lang.org/issues/19716

* Author: alexdowad (Alex Dowad)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-linux-musl]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
This is the same problem previously reported against Ruby 2.5 in https://bugs.ruby-lang.org/issues/14387. I just ran into the same problem on Ruby 3.1.4, built on Alpine Linux 3.16.

@hsbt stated in the previous thread (https://bugs.ruby-lang.org/issues/14387#note-28):

> If you have this issue with Ruby 3.2, please file it with another issue.
I hacked `stack_check` in gc.c to print the values of `STACK_START` and `STACK_END` on stack overflow; on the Alpine 3.16 host where this problem just occurred, the values printed were:
Start=0x7ffd0bf4f000, End=0x7ffd0bf32530
...which shows that Ruby thinks the stack size is only 131072 bytes. On the other hand, `ulimit -s` shows a stack size limit of 8192 KB.

This Ruby 3.1.4 was built from unmodified source code downloaded from https://cache.ruby-lang.org; the build was configured using `CFLAGS='-march=native' ./configure --disable-install-doc`.

The invocation of Ruby which blew the stack was `bundle exec rake db:migrate`, on a mid-sized Rails project.

Regarding @ncopa's patch from #14387, @wanabe listed some things which should be done before it is merged into mainline Ruby:
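As a quick sanity check on the figures above, the following Python sketch computes the span between the two printed addresses and compares it with 128 KiB (131072 bytes) and with the limit reported by `ulimit -s`. The function names are illustrative only. It assumes that `ulimit -s` reports units of 1024 bytes and that the stack grows downward, so that Start is the higher address. The `resource.getrlimit` call merely shows the limit on whichever machine runs the script, not on the Alpine host from the report.

```python
import resource

KIB = 1024


def stack_span(start: int, end: int) -> int:
    """Bytes between the two addresses printed by the instrumented stack_check."""
    return abs(start - end)


def soft_stack_limit_bytes():
    """Soft RLIMIT_STACK of this process in bytes, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return None if soft == resource.RLIM_INFINITY else soft


if __name__ == "__main__":
    # Addresses quoted in the report (Start and End at the time of overflow).
    span = stack_span(0x7FFD0BF4F000, 0x7FFD0BF32530)
    print(f"span at overflow: {span} bytes")
    print(f"fits within 128 KiB: {span <= 128 * KIB}")
    # Assumption: the 8192 from `ulimit -s` is in units of 1024 bytes.
    print(f"ulimit / 128 KiB: {8192 * KIB // (128 * KIB)}x")
    print(f"RLIMIT_STACK on this machine: {soft_stack_limit_bytes()}")
```

The span comes out at 117456 bytes, which fits inside a 128 KiB stack, while the `ulimit -s` value is 64 times larger than 128 KiB.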
> Okay, the patch needs one or more proofs of its behaviour, like that:
>
> * Original issue [ruby-dev:50421] has gone away.
> * Standard test codes run well.
>   * test-all
>   * ruby/spec
> * getrlimit works on some situations like:
>   * on single thread
>   * with multiple threads
>   * with RLIMIT_STACK environment variable
> * getrlimit code of musl is implemented correctly as expected. (But it's doubtful whether it can be. I guess that a proof of code soundness is very difficult.)
> * Some "real world" applications can work. I think it is better example that that application(s) can't work without the patch.
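One item in the list above, that `getrlimit` behaves the same from a single thread and from multiple threads, can be illustrated from user space. The sketch below is in Python rather than C and does not exercise the patch itself. It only shows that `RLIMIT_STACK` is a per-process value, so worker threads should read the same (soft, hard) pair as the main thread. The helper name is hypothetical.

```python
import resource
import threading


def stack_limits_from_threads(n_threads: int = 4):
    """Read RLIMIT_STACK in the calling thread and in n_threads worker threads."""
    # First entry: the value seen by the calling (usually main) thread.
    results = [resource.getrlimit(resource.RLIMIT_STACK)]
    lock = threading.Lock()

    def worker():
        value = resource.getrlimit(resource.RLIMIT_STACK)
        with lock:
            results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    limits = stack_limits_from_threads()
    print(limits)
    print("consistent across threads:", len(set(limits)) == 1)
```

Running the script under different `ulimit -s` settings would be one way to cover the remaining case in that item, though a real proof for the patch would need the equivalent check in C against musl.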
I am happy to help cover some of these points if the Ruby development team is still interested in merging @ncopa's patch.

-- 
https://bugs.ruby-lang.org/

Issue #19716 has been updated by alexdowad (Alex Dowad). I just applied @ncopa's patch from: https://bugs.ruby-lang.org/attachments/download/7081/0001-thread_pthread.c-m... After `make` and `make install`, I am now able to run `bundle exec rake db:migrate` normally. Commands used were:
    wget -O 'thread-stack-fix.patch' 'https://bugs.ruby-lang.org/attachments/download/7081/0001-thread_pthread.c-m...'
    patch -p1 -i thread-stack-fix.patch
    rm thread-stack-fix.patch
    make
    make install

Issue #19716 has been updated by alexdowad (Alex Dowad).

Output from `make test` after applying the patch:

```
Fiber count: 10000 (skipping)
PASS all 1669 tests
exec ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems "./bootstraptest/runner.rb" --ruby="ruby --disable-gems" ./KNOWNBUGS.rb
2023-06-07 00:26:25 +0000
Driver is ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-linux-musl]
Target is ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-linux-musl]

KNOWNBUGS.rb
PASS 0
No tests, no problem
test succeeded
```

Issue #19716 has been updated by retro (Josef Šimánek).

It would be great to get this fixed finally. There are no bugs reported, since Alpine Linux is used mostly as a Docker container system and the official Ruby Alpine image already includes the patch https://github.com/docker-library/ruby/blob/31c1fdba369192fe2c3cf327d7d98819.... RubyGems.org has been running on this Alpine image, with this patch, for the last few years.

I tried this patch on Ruby 3.2.2 on glibc Linux (Fedora).

```bash
$ make test-all
...
Finished tests in 957.018822s, 24.4938 tests/s, 5851.8306 assertions/s.
23441 tests, 5600312 assertions, 0 failures, 0 errors, 89 skips

ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
```

Issue #19716 has been updated by hsbt (Hiroshi SHIBATA).

Status changed from Open to Closed

Unfortunately, there is no active maintainer for the musl or Alpine platforms. I tagged them to [musl](https://bugs.ruby-lang.org/projects/ruby-master/issues?fields%5B%5D=issue_tags&operators%5Bissue_tags%5D=%3D&set_filter=1&values%5Bissue_tags%5D%5B%5D=musl).

We welcome patch for them.

Issue #19716 has been updated by ncopa (Natanael Copa).

hsbt (Hiroshi SHIBATA) wrote in #note-5:

> We welcome patch for them.

[Patch](https://bugs.ruby-lang.org/attachments/7081) has been available for years. https://bugs.ruby-lang.org/issues/19716#note-2 confirms it still works for musl, and https://bugs.ruby-lang.org/issues/19716#note-3 confirms it does not break glibc.

How do you want me to submit the patch?

----------------------------------------
Bug #19716: SystemStackError occurs too easily on Alpine Linux (due to small stack size reported by pthread_attr_getstacksize on musl libc)
https://bugs.ruby-lang.org/issues/19716#change-107502

* Author: alexdowad (Alex Dowad)
* Status: Feedback
* ruby -v: ruby 3.1.4p223 (2023-03-30 revision 957bb7cb81) [x86_64-linux-musl]
* Backport: 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
participants (4)
- alexdowad (Alex Dowad)
- hsbt (Hiroshi SHIBATA)
- ncopa (Natanael Copa)
- retro