Issue #19965 has been updated by mwaldvogel (Michael Waldvogel).
I've recently updated one of my Linux systems (Gentoo) to glibc 2.38 (that was the
only change). After the update, the error below occurs most of the time. Among other
things, this breaks rubygems for me. With ruby 3.2.2, reinstalled via rvm, I
didn't encounter the issue. However, the issue remained after reinstalling ruby
3.3.0, and it also occurs with ruby master. Since this goes back to getaddrinfo (which
works without any issues outside of ruby), and as this seems to be the only major change
to the stdlib socket library, I'm assuming the problem was introduced with this feature.
```
3.3.0 :001 > require 'socket'
=> true
3.3.0 :002 > Socket.getaddrinfo('rubygems.org', 443)
(irb):2:in `getaddrinfo': getaddrinfo: Temporary failure in name resolution (Socket::ResolutionError)
from (irb):2:in `<main>'
from <internal:kernel>:187:in `loop'
from /usr/local/rvm/rubies/ruby-3.3.0/lib/ruby/gems/3.3.0/gems/irb-1.11.0/exe/irb:9:in `<top (required)>'
from /usr/local/rvm/rubies/ruby-3.3.0/bin/irb:25:in `load'
from /usr/local/rvm/rubies/ruby-3.3.0/bin/irb:25:in `<main>'
3.3.0 :003 > Socket.getaddrinfo('rubygems.org', 443)
(irb):3:in `getaddrinfo': getaddrinfo: Temporary failure in name resolution (Socket::ResolutionError)
from (irb):3:in `<main>'
from <internal:kernel>:187:in `loop'
from /usr/local/rvm/rubies/ruby-3.3.0/lib/ruby/gems/3.3.0/gems/irb-1.11.0/exe/irb:9:in `<top (required)>'
from /usr/local/rvm/rubies/ruby-3.3.0/bin/irb:25:in `load'
from /usr/local/rvm/rubies/ruby-3.3.0/bin/irb:25:in `<main>'
3.3.0 :004 > Socket.getaddrinfo('rubygems.org', 443)
=>
[["AF_INET", 443, "151.101.193.227", "151.101.193.227", 2, 1, 6],
 ["AF_INET", 443, "151.101.193.227", "151.101.193.227", 2, 2, 17],
 ["AF_INET", 443, "151.101.193.227", "151.101.193.227", 2, 3, 0],
 ["AF_INET", 443, "151.101.65.227", "151.101.65.227", 2, 1, 6],
 ["AF_INET", 443, "151.101.65.227", "151.101.65.227", 2, 2, 17],
 ["AF_INET", 443, "151.101.65.227", "151.101.65.227", 2, 3, 0],
 ["AF_INET", 443, "151.101.129.227", "151.101.129.227", 2, 1, 6],
 ["AF_INET", 443, "151.101.129.227", "151.101.129.227", 2, 2, 17],
 ["AF_INET", 443, "151.101.129.227", "151.101.129.227", 2, 3, 0],
 ["AF_INET", 443, "151.101.1.227", "151.101.1.227", 2, 1, 6],
 ["AF_INET", 443, "151.101.1.227", "151.101.1.227", 2, 2, 17],
 ["AF_INET", 443, "151.101.1.227", "151.101.1.227", 2, 3, 0],
 ["AF_INET6", 443, "2a04:4e42:400::483", "2a04:4e42:400::483", 10, 1, 6],
 ["AF_INET6", 443, "2a04:4e42:400::483", "2a04:4e42:400::483", 10, 2, 17],
 ["AF_INET6", 443, "2a04:4e42:400::483", "2a04:4e42:400::483", 10, 3, 0],
 ["AF_INET6", 443, "2a04:4e42:600::483", "2a04:4e42:600::483", 10, 1, 6],
 ["AF_INET6", 443, "2a04:4e42:600::483", "2a04:4e42:600::483", 10, 2, 17],
 ["AF_INET6", 443, "2a04:4e42:600::483", "2a04:4e42:600::483", 10, 3, 0],
 ["AF_INET6", 443, "2a04:4e42:200::483", "2a04:4e42:200::483", 10, 1, 6],
 ["AF_INET6", 443, "2a04:4e42:200::483", "2a04:4e42:200::483", 10, 2, 17],
 ["AF_INET6", 443, "2a04:4e42:200::483", "2a04:4e42:200::483", 10, 3, 0],
 ["AF_INET6", 443, "2a04:4e42::483", "2a04:4e42::483", 10, 1, 6],
 ["AF_INET6", 443, "2a04:4e42::483", "2a04:4e42::483", 10, 2, 17],
 ["AF_INET6", 443, "2a04:4e42::483", "2a04:4e42::483", 10, 3, 0]]
3.3.0 :005 >
```
----------------------------------------
Feature #19965: Make the name resolution interruptible
https://bugs.ruby-lang.org/issues/19965#change-106133
* Author: mame (Yusuke Endoh)
* Status: Closed
* Priority: Normal
* Assignee: mame (Yusuke Endoh)
----------------------------------------
## Problem
Currently, Ruby name resolution is not interruptible.
```
$ cat /etc/resolv.conf
nameserver 198.51.100.1
$ ./local/bin/ruby -rsocket -e 'Addrinfo.getaddrinfo("www.ruby-lang.org", 80)'
^C^C^C^C
```
If you set a non-responsive IP as the nameserver, you cannot stop `Addrinfo.getaddrinfo`
by pressing Ctrl+C. Note that `Timeout.timeout` does not work either.
This is because there is no way to cancel `getaddrinfo(3)`.
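For reference, the `Timeout.timeout` attempt mentioned above might look roughly like the sketch below. The helper name `resolve_with_timeout` and its `resolver:` keyword are hypothetical and exist only so that the snippet can be exercised without a network. With the default resolver, an unpatched Ruby, and an unresponsive nameserver, the timeout would not be expected to fire until `getaddrinfo(3)` itself returns.

```ruby
require "socket"
require "timeout"

# Hypothetical helper (not part of Ruby): try to bound a lookup with Timeout.timeout.
# `resolver` defaults to Addrinfo.getaddrinfo; it is injectable for illustration only.
def resolve_with_timeout(host, port, seconds, resolver: Addrinfo.method(:getaddrinfo))
  Timeout.timeout(seconds) { resolver.call(host, port) }
rescue Timeout::Error
  # Reached only if the blocked call can actually be interrupted.
  nil
end
```

Calling `resolve_with_timeout("www.ruby-lang.org", 80, 2)` is the case described in this section.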
## Proposal
I wrote a patch to make `getaddrinfo(3)` work in a separate pthread.
https://github.com/ruby/ruby/pull/8695
Whenever it needs name resolution, it creates a worker pthread, and executes
`getaddrinfo(3)` in it.
The caller thread waits for the worker to complete.
When an interrupt occurs, the caller thread stops waiting and leaves the worker
pthread behind.
The detached worker pthread will exit after `getaddrinfo(3)` completes (or name resolution
times out).
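The caller/worker arrangement described above can be illustrated at the Ruby level. This is only an analogy for the C patch, not its implementation: `run_abandonable` is a hypothetical name, a Ruby `Thread` stands in for the worker pthread, and a wait limit stands in for the interrupt.

```ruby
# Hypothetical sketch: run a blocking job in a worker thread.
# The caller waits for the result but can give up and leave the worker behind.
def run_abandonable(wait_limit, &blocking_job)
  result = Queue.new
  worker = Thread.new { result << blocking_job.call }

  # Thread#join returns nil when the limit expires before the worker finishes.
  if worker.join(wait_limit)
    [:done, result.pop]
  else
    # The worker is not killed; it keeps running and exits on its own later,
    # much like the detached pthread in the proposal.
    [:abandoned, nil]
  end
end
```

For example, `run_abandonable(0.1) { sleep 1 }` gives up after about 0.1 seconds while the worker finishes in the background.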
## Evaluation
By applying this patch, name resolution is now interruptible.
```
$ ./local/bin/ruby -rsocket -e 'pp Addrinfo.getaddrinfo("www.ruby-lang.org", 80)'
^C-e:1:in `getaddrinfo': Interrupt
from -e:1:in `<main>'
```
As a drawback, name resolution performance will be degraded.
```
10000.times { Addrinfo.getaddrinfo("www.ruby-lang.org", 80) }
# Before patch: 2.3 sec.
# After patch: 3.0 sec.
```
However, I think that the time spent on name resolution is typically small relative to the
application's total runtime. For example, the difference in `URI.open` performance is small.
```
100.times { URI.open("https://www.ruby-lang.org").read }
# Before patch: 3.36 sec.
# After patch: 3.40 sec.
```
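Timings like the ones in this section could be collected with a small harness along the following lines. This is a sketch, not the script actually used for the numbers above; `time_lookups` is a hypothetical helper built on the standard `benchmark` library.

```ruby
require "benchmark"

# Hypothetical helper: run the given block `count` times and
# return the elapsed wall-clock time in seconds.
def time_lookups(count, &lookup)
  Benchmark.realtime { count.times { lookup.call } }
end

# Possible usage (needs `require "socket"` and a working resolver):
#   time_lookups(10_000) { Addrinfo.getaddrinfo("www.ruby-lang.org", 80) }
```

Running the same call on a build with and without the patch gives the two figures to compare.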
## Alternative approaches
I previously proposed using c-ares to resolve this issue (#19430). However, there was
concern that c-ares does not respect each platform's own name resolution
mechanism.
## Room for improvement
* Currently, this patch works only when pthread is available.
* It might be possible to forcibly stop the worker threads by using `pthread_cancel`.
However, using `pthread_cancel` with `getaddrinfo(3)` still seems premature; glibc appears
to have had a related bug until recently:
https://bugzilla.redhat.com/show_bug.cgi?id=1405071
https://sourceware.org/bugzilla/show_bug.cgi?id=20975
* It would be more efficient to pool worker pthreads instead of creating them each time.
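
The pooling idea in the last item can be sketched in Ruby as follows. This is an illustration of the general technique only, not part of the patch: `LookupPool` is a hypothetical class, and Ruby threads stand in for worker pthreads.

```ruby
# Hypothetical sketch: a fixed set of worker threads shared by all lookups,
# instead of creating one worker per lookup.
class LookupPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) { Thread.new { work_loop } }
  end

  # Enqueue a blocking job. The returned Queue later receives
  # [:ok, value] or [:error, exception].
  def submit(&job)
    reply = Queue.new
    @jobs << [job, reply]
    reply
  end

  # Stop accepting jobs and wait for the workers to drain the queue.
  def shutdown
    @jobs.close
    @workers.each(&:join)
  end

  private

  def work_loop
    # Queue#pop returns nil once the queue is closed and empty.
    while (item = @jobs.pop)
      job, reply = item
      outcome = begin
        [:ok, job.call]
      rescue StandardError => e
        [:error, e]
      end
      reply << outcome
    end
  end
end
```

With `require "socket"`, a lookup could then be submitted as `pool.submit { Addrinfo.getaddrinfo("www.ruby-lang.org", 80) }.pop`.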
--
https://bugs.ruby-lang.org/