Issue #19430 has been updated by kjtsanaktsidis (KJ Tsanaktsidis).
> Thanks for your comments. As far as a little research,
> it looks like c-ares supports /etc/resolver setting. The library uses libresolv to
> identify the nameserver on macOS and iOS. (Sorry if I'm lying, I haven't tried it
> myself.)
Not quite, unfortunately. It does link against libresolv and call `res_getservers` to look
up DNS servers, but that only returns the servers for the _first_ resolver. For example,
if I run `scutil --dns` on my MacBook, I get this:
```
% scutil --dns
DNS configuration
resolver #1
nameserver[0] : 1.1.1.1
nameserver[1] : 1.0.0.1
flags : Request A records
reach : 0x00000002 (Reachable)
resolver #2
domain : local
options : mdns
timeout : 5
flags : Request A records
reach : 0x00000000 (Not Reachable)
order : 300000
resolver #3
domain : 254.169.in-addr.arpa
options : mdns
timeout : 5
flags : Request A records
reach : 0x00000000 (Not Reachable)
order : 300200
resolver #4
domain : 8.e.f.ip6.arpa
options : mdns
timeout : 5
flags : Request A records
reach : 0x00000000 (Not Reachable)
order : 300400
resolver #5
domain : 9.e.f.ip6.arpa
options : mdns
timeout : 5
flags : Request A records
reach : 0x00000000 (Not Reachable)
order : 300600
resolver #6
domain : a.e.f.ip6.arpa
options : mdns
timeout : 5
flags : Request A records
reach : 0x00000000 (Not Reachable)
order : 300800
resolver #7
domain : b.e.f.ip6.arpa
options : mdns
timeout : 5
flags : Request A records
reach : 0x00000000 (Not Reachable)
order : 301000
resolver #8
domain : getacmeapp-dev.com
nameserver[0] : 127.0.0.1
port : 1054
flags : Request A records, Request AAAA records
reach : 0x00030002 (Reachable,Local Address,Directly Reachable Address)
resolver #9
domain : zd-dev.com
nameserver[0] : 127.0.0.1
port : 1054
flags : Request A records, Request AAAA records
reach : 0x00030002 (Reachable,Local Address,Directly Reachable Address)
resolver #10
domain : docker
nameserver[0] : 127.0.0.1
port : 1054
flags : Request A records, Request AAAA records
reach : 0x00030002 (Reachable,Local Address,Directly Reachable Address)
resolver #11
domain : ob-dev.com
nameserver[0] : 127.0.0.1
port : 1054
flags : Request A records, Request AAAA records
reach : 0x00030002 (Reachable,Local Address,Directly Reachable Address)
resolver #12
domain : consul
nameserver[0] : 127.0.0.1
port : 1054
flags : Request A records, Request AAAA records
reach : 0x00030002 (Reachable,Local Address,Directly Reachable Address)
resolver #13
domain : bime-development.com
nameserver[0] : 127.0.0.1
port : 1054
flags : Request A records, Request AAAA records
reach : 0x00030002 (Reachable,Local Address,Directly Reachable Address)
DNS configuration (for scoped queries)
resolver #1
nameserver[0] : 1.1.1.1
nameserver[1] : 1.0.0.1
if_index : 15 (en0)
flags : Scoped, Request A records
reach : 0x00000002 (Reachable)
```
If I run `adig mysql.docker` (adig is c-ares's CLI dig tool) under lldb and put a
breakpoint [here](https://github.com/c-ares/c-ares/blob/38b30bc922c21faa156939bde15ea35…,
I can see that the `res` variable contains only `1.1.1.1` and `1.0.0.1`, i.e. the _first_
resolver from `scutil --dns`. It doesn't contain any information about the other
resolvers.
Running `adig mysql.docker` therefore sends the query to Cloudflare, and of course it
doesn't work.
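
To make the missing behaviour concrete, here is a rough Ruby sketch of per-domain routing. It is not how macOS or c-ares implements anything: `parse_resolvers` and `resolver_for` are made-up names, the matching rule (longest `domain` suffix wins, otherwise the first resolver without a `domain`) is an assumption, and mdns resolvers, `order`, and the scoped-query section are ignored.

```ruby
# Parse `scutil --dns`-style text into an array of resolver hashes.
# Stops at the "scoped queries" section, which this sketch ignores.
def parse_resolvers(text)
  resolvers = []
  text.each_line do |raw|
    line = raw.strip
    break if line.start_with?("DNS configuration (for scoped queries)")

    case line
    when /\Aresolver #\d+\z/
      resolvers << { domain: nil, nameservers: [], port: 53 }
    when /\Adomain\s*:\s*(\S+)\z/
      resolvers.last[:domain] = $1 if resolvers.any?
    when /\Anameserver\[\d+\]\s*:\s*(\S+)\z/
      resolvers.last[:nameservers] << $1 if resolvers.any?
    when /\Aport\s*:\s*(\d+)\z/
      resolvers.last[:port] = $1.to_i if resolvers.any?
    end
  end
  resolvers
end

# Pick the resolver whose domain is the longest suffix of the hostname
# (assumed rule); fall back to the first resolver that has no domain.
def resolver_for(hostname, resolvers)
  name = hostname.downcase.chomp(".")
  candidates = resolvers.select do |r|
    r[:domain] && !r[:nameservers].empty? &&
      (name == r[:domain] || name.end_with?(".#{r[:domain]}"))
  end
  candidates.max_by { |r| r[:domain].length } ||
    resolvers.find { |r| r[:domain].nil? }
end
```

Fed the output above, `resolver_for("mysql.docker", resolvers)` would select the `docker` resolver (`127.0.0.1`, port 1054) rather than the first one, which is the step c-ares currently does not perform.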
> Considering that curl is so widely used, I believe
> that the common problems with the most commonly used operating systems have already been
> taken care of.
Actually, it seems that whether or not curl uses c-ares depends on how it's configured. If
it's configured with `--enable-ares --disable-threaded-resolver`, c-ares is used, and
I can't make it resolve `.docker` etc. domains. If it's configured with
`--disable-ares --enable-threaded-resolver`, c-ares is not used and `.docker` resolution
works.
It seems that system curl on macOS and also Homebrew curl both use the threaded resolver
and not c-ares, which is why this works for us today.
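
A quick way to check which kind of build you have is to look at `curl --version`. The sketch below assumes that a c-ares build lists `c-ares/<version>` among its libraries in that output; the helper name `curl_built_with_cares?` is made up for illustration.

```ruby
# Returns true when the given `curl --version` text mentions a c-ares
# library version (assumption: c-ares builds print "c-ares/<version>").
def curl_built_with_cares?(version_output)
  version_output.match?(%r{\bc-ares/\d})
end

# Usage (requires curl on PATH):
#   curl_built_with_cares?(`curl --version`)
```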
Actually this is a good find - after the issue was closed, the original reporter said
routing requests for particular domains to particular servers still didn't work. The
maintainers then said:
> The issue is that the configuration is actively trying
> to route different domains to different name servers. That's not a standardized thing.
> It appears to be a very mac-specific thing. It is not something c-ares has any concept of.
> You typically reach out to your configured nameservers to look up the entirety of your
> domains, not route different top levels to different servers.
> We'd probably accept patches to add such support for MacOS, but it is definitely not
> on any development roadmap.
So it sounds like perhaps the way to go here for us at Zendesk is to contribute support
for this into c-ares itself, hopefully before c-ares makes its way into Ruby :)
-------------------------
Postscript: Some notes about mdns
Another issue that came to mind while looking at the output of `scutil --dns` is mdns,
which is commonly used to handle `.local` domains on e.g. home LANs. c-ares has no support
for it, so on macOS it would not be able to resolve such hostnames. This would also be the
case on Linux systems using the [mdns nss module](https://github.com/lathiat/nss-mdns) to
handle mdns directly from `getaddrinfo(3)`. However, I think the kind of Linux systems that
use mdns these days are more likely to be using `systemd-resolved`, where
c-ares will work (because its DNS queries will be sent to the resolved stub resolver at
`127.0.0.53`, which will itself do the mdns query).
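
As a small illustration, the following Ruby sketch checks whether a `resolv.conf`-style text points at the resolved stub address mentioned above. The helper names are made up, and the parsing is deliberately simplistic (it only looks at `nameserver` lines and strips `#`/`;` comments).

```ruby
RESOLVED_STUB = "127.0.0.53"

# Collect the addresses from "nameserver <addr>" lines.
def nameservers_in(resolv_conf_text)
  resolv_conf_text.each_line.filter_map do |line|
    fields = line.sub(/[#;].*/, "").split
    fields[1] if fields[0] == "nameserver" && fields[1]
  end
end

# True when queries from a plain DNS client such as c-ares would be sent
# to the systemd-resolved stub listener.
def uses_resolved_stub?(resolv_conf_text)
  nameservers_in(resolv_conf_text).include?(RESOLVED_STUB)
end

# Usage: uses_resolved_stub?(File.read("/etc/resolv.conf"))
```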
I guess it might be possible to implement mdns inside c-ares too, maybe by linking against
libavahi, but I haven't really looked into this.
----------------------------------------
Feature #19430: Contribution wanted: DNS lookup by c-ares library
https://bugs.ruby-lang.org/issues/19430#change-101814
* Author: mame (Yusuke Endoh)
* Status: Open
* Priority: Normal
----------------------------------------
## Problem
At the present time, Ruby uses `getaddrinfo(3)` to resolve names. Because this function is
synchronous, we cannot interrupt the thread performing name resolution until the DNS
server returns a response.
We can see this behavior by setting blackhole.webpagetest.org (72.66.115.13) as a DNS
server, which swallows all packets, and resolving any name:
```
# cat /etc/resolv.conf
nameserver 72.66.115.13
# ./local/bin/ruby -rsocket -e 'Addrinfo.getaddrinfo("www.ruby-lang.org", 80)'
^C^C^C^C
```
As we see, Ctrl+C does not stop ruby.
The current workaround that users can take is to do name resolution in a Ruby thread.
```ruby
Thread.new { Addrinfo.getaddrinfo("www.ruby-lang.org", 80) }.value
```
The thread that calls this code is interruptible. (Note that the newly created thread
itself will be stuck until the DNS lookup times out.)
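
The same workaround can also give the caller a deadline. This is only a sketch: `with_deadline` is a hypothetical helper, not part of Ruby, and the abandoned thread still stays blocked in the background until the lookup itself times out.

```ruby
# Run the block in a throwaway thread and stop waiting after `seconds`.
# Returns the block's result, or nil if the deadline passed first.
# An exception raised by the block is re-raised in the caller.
def with_deadline(seconds)
  worker = Thread.new { yield }
  worker.report_on_exception = false
  worker.join(seconds) ? worker.value : nil
end

# Usage (needs `require "socket"`):
#   addrs = with_deadline(2) { Addrinfo.getaddrinfo("www.ruby-lang.org", 80) }
```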
## Proposal
We can solve this problem by using c-ares, which is an asynchronous name resolver, as a
backend of `Addrinfo.getaddrinfo`, etc. (@sorah told me about this library, thanks!)
https://c-ares.org/
I have created a PoC patch.
https://github.com/mame/ruby/commit/547806146993bbc25984011d423dcc0f913b211c
By applying this patch, we can interrupt `Addrinfo.getaddrinfo` by Ctrl+C.
```
# cat /etc/resolv.conf
nameserver 72.66.115.13
# ./local/bin/ruby -rsocket -e 'Addrinfo.getaddrinfo("www.ruby-lang.org", 80)'
^C-e:1:in `getaddrinfo': Interrupt
from -e:1:in `<main>'
```
## Discussion
### About c-ares
According to the c-ares site, some major tools including libcurl, Wireshark, and Apache
Arrow are already using c-ares. Among language runtimes, node.js seems to be using
c-ares.
I am honestly not sure about the compatibility of c-ares with `getaddrinfo(3)`. I guess
there is no major incompatibility because I have not experienced any name resolution
problem with curl. @akr (who is the author and maintainer of Ruby's socket library)
suggested checking whether OS-specific name resolution, e.g., WINS on Windows, NIS on
Solaris, etc., is supported. He also said that it may be acceptable even if they are not
supported.
Whether to bundle c-ares source code with ruby would require further discussion. If this
proposal is accepted, then c-ares will become a de facto essential dependency for
practical use, like gmp, in my opinion. Incidentally, node.js bundles c-ares:
https://github.com/nodejs/node/tree/main/deps/cares
### Alternative approaches
Recent glibc provides `getaddrinfo_a(3)`, which performs asynchronous name resolution.
However, this function has a fatal problem: it is incompatible with `fork(2)`, which is
heavily used in the Ruby ecosystem. In fact, the attempt to use `getaddrinfo_a(3)`
(#17134) was reverted because it failed Rails tests (#17220).
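
A general reason why helper-thread designs and `fork(2)` interact badly is that only the thread that calls `fork` exists in the child. Whether that is exactly what broke the Rails tests is not stated here; the Ruby sketch below (hypothetical method name, needs a platform with `fork`) only demonstrates the general effect.

```ruby
# Returns [alive_in_parent, alive_in_child] for a background thread
# that was started before forking.
def thread_alive_across_fork
  helper = Thread.new { sleep }
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.puts helper.alive?   # evaluated inside the child process
    writer.close
    exit!(0)
  end
  writer.close
  alive_in_child = reader.read.strip == "true"
  reader.close
  Process.wait(pid)
  [helper.alive?, alive_in_child]
ensure
  helper&.kill
end
```

Any library that keeps its own resolver threads has to notice this and restart them in the child.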
Another alternative is to have a worker pthread inside Ruby that calls `getaddrinfo(3)`.
Instead of calling `getaddrinfo(3)` directly, `Addrinfo.getaddrinfo` would ask the worker to
resolve a name and wait for a response. This method should be able to implement
cancellation. (Simply put, this means reimplementing `getaddrinfo_a(3)` on our own,
taking `fork(2)` into account.)
This has two advantages: it adds no dependency on external libraries, and it has no
compatibility issues with `getaddrinfo(3)`. However, it is considerably more difficult to
implement and maintain. An internal pthread may have a non-trivial impact on execution
efficiency and memory usage. Also, we may need to implement a mechanism to dynamically
change the number of workers depending on the load.
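
The worker idea can be sketched at the Ruby level as follows. This is not the proposed C implementation: `ResolverWorker` is a hypothetical class, the lookup is injected as a block so the sketch does not depend on real DNS, and `fork(2)` handling is left out entirely.

```ruby
require "timeout"

# One long-lived worker thread serves lookup requests from a queue.
# Callers wait on a per-request reply queue and can give up early.
class ResolverWorker
  def initialize(&lookup)
    @lookup = lookup
    @requests = Queue.new
    @thread = Thread.new { serve }
  end

  # Raises Timeout::Error if no reply arrives within `timeout` seconds.
  def resolve(name, timeout:)
    reply = Queue.new
    @requests << [name, reply]
    status, payload = Timeout.timeout(timeout) { reply.pop }
    raise payload if status == :error
    payload
  end

  private

  def serve
    loop do
      name, reply = @requests.pop
      begin
        reply << [:ok, @lookup.call(name)]
      rescue StandardError => e
        reply << [:error, e]
      end
    end
  end
end

# Usage (needs `require "socket"`):
#   worker = ResolverWorker.new { |host| Addrinfo.getaddrinfo(host, 80) }
#   worker.resolve("www.ruby-lang.org", timeout: 2)
```

With a single worker, a request that has been abandoned by its caller still occupies the worker until the lookup returns, so later requests queue up behind it. That is the situation in which the number of workers would have to grow with load.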
It would be ideal if we could try and evaluate both approaches. But my current impression
is that using c-ares is the quickest and best compromise.
## Contribution wanted
I have taken this as far as the PoC, but don't have much time to complete it. @naruse
suggested that I create a ticket asking for contributions. Is anyone interested in this?
* This patch changes `rsock_getaddrinfo` to accept a timeout argument. There are several
places where Qnil is passed as a timeout (where I added `// TODO` in the PoC). We need to
consider what timeout we should pass.
* This handles only `getaddrinfo`, but we also need to handle `getnameinfo` (and anything
else, if any). There may be some issues I'm not aware of.
* I have not yet tested this PoC seriously. It would be great if we could evaluate it with
some real apps.
Also, it would be great to hear from someone who knows more about c-ares.
--
https://bugs.ruby-lang.org/