Issue #21617 has been updated by naruse (Yui NARUSE).

I agree with the direction that URI supports IDN. But there are some barriers to be solved:

* IDN Library
  * IDN needs some logic and tables including punycode, nameprep, and some data tables, as far as I remember
* URI's argument
  * URI.parse's argument is a URI. To support IDN, the argument needs to be changed

### IDN Library

libidn2 is the well-known library. But it introduces one more external dependency.
Using a pure Ruby implementation for this is a good idea to avoid the dependency problem.

### URI's argument

Introducing a WHATWG parser is an option. I agree with the direction of adopting a WHATWG parser in the uri library.
But in this ticket, just allowing IDN is also a good option to minimize the discussion.

----------------------------------------
Feature #21617: Add Internationalized Domain Name (IDN) support to URI
https://bugs.ruby-lang.org/issues/21617#change-114721

* Author: byroot (Jean Boussier)
* Status: Open
----------------------------------------
Originally proposed by @chucke at https://github.com/ruby/uri/issues/76, trying to formalize it here.

### Context

[Internationalized Domain Names](https://en.wikipedia.org/wiki/Internationalized_domain_name) are getting more common, yet Ruby's `uri` default gem has no support for them:

```ruby
URI("https://日本語.jp/")
URI must be ascii only "https://\u65E5\u672C\u8A9E.jp/" (URI::InvalidURIError)
```
So any program that needs to handle arbitrary valid URIs provided by users can't use the `uri` gem, and instead has to depend on third-party gems like [`addressable`](https://rubygems.org/gems/addressable):
```ruby
>> Addressable::URI.parse("https://日本語.jp/")
=> #<Addressable::URI:0xd648 URI:https://日本語.jp/>
```
But even there, it won't seamlessly work with other libraries such as `net-http`:

```ruby
Net::HTTP.get(Addressable::URI.parse("https://日本語.jp/")).bytesize
OpenSSL::SSL::SSLSocket#connect_nonblock': SSL_connect returned=1 errno=0 peeraddr=[2001:218:3001:7::110]:443 state=error: ssl/tls alert handshake failure (SSL alert number 40) (OpenSSL::SSL::SSLError)
```
You have to explicitly normalize the URL:
```ruby
>> Addressable::URI.parse("https://日本語.jp/").normalize
=> #<Addressable::URI:0x130d0 URI:https://xn--wgv71a119e.jp/>
>> Net::HTTP.get(Addressable::URI.parse("https://日本語.jp/").normalize).bytesize
=> 8703
```
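The `xn--wgv71a119e` host returned by `normalize` above is the Punycode (RFC 3492) encoding of `日本語`, prefixed with `xn--`. As a rough illustration of what the ASCII conversion involves, here is a minimal pure Ruby sketch of the encoding step. The names `PunycodeSketch`, `encode_label` and `to_ascii_host` are hypothetical and are not part of `uri`, `addressable` or `uri-idna`. The sketch also assumes the input is already lower-cased and valid: it skips the IDNA mapping and validation steps (nameprep / IDNA 2008 tables), overflow checks and label length limits, which a real implementation would need.

```ruby
# Hypothetical sketch of RFC 3492 Punycode encoding; not the uri gem's API.
module PunycodeSketch
  # Bootstring parameters defined by RFC 3492 for Punycode.
  BASE = 36
  TMIN = 1
  TMAX = 26
  SKEW = 38
  DAMP = 700
  INITIAL_BIAS = 72
  INITIAL_N = 128

  module_function

  # Map a digit value 0..35 to "a".."z" (0..25) or "0".."9" (26..35).
  def digit(value)
    (value < 26 ? 97 + value : 22 + value).chr
  end

  # Bias adaptation function from RFC 3492, section 6.1.
  def adapt(delta, num_points, first_time)
    delta = first_time ? delta / DAMP : delta / 2
    delta += delta / num_points
    k = 0
    while delta > ((BASE - TMIN) * TMAX) / 2
      delta /= BASE - TMIN
      k += BASE
    end
    k + (BASE - TMIN + 1) * delta / (delta + SKEW)
  end

  # Encode one label (no dots) to Punycode, without the "xn--" prefix.
  def encode_label(label)
    code_points = label.codepoints
    basic = code_points.select { |cp| cp < 0x80 }
    output = basic.map(&:chr).join
    handled = basic.length
    basic_count = handled
    output << "-" if basic_count > 0

    n = INITIAL_N
    delta = 0
    bias = INITIAL_BIAS

    while handled < code_points.length
      # Smallest code point not yet handled.
      m = code_points.select { |cp| cp >= n }.min
      delta += (m - n) * (handled + 1)
      n = m
      code_points.each do |cp|
        delta += 1 if cp < n
        next unless cp == n

        # Emit delta as a generalized variable-length integer.
        q = delta
        k = BASE
        loop do
          t = if k <= bias
                TMIN
              elsif k >= bias + TMAX
                TMAX
              else
                k - bias
              end
          break if q < t

          output << digit(t + (q - t) % (BASE - t))
          q = (q - t) / (BASE - t)
          k += BASE
        end
        output << digit(q)
        bias = adapt(delta, handled + 1, handled == basic_count)
        delta = 0
        handled += 1
      end
      delta += 1
      n += 1
    end
    output
  end

  # Convert each non-ASCII label of a host to its "xn--" form.
  def to_ascii_host(host)
    host.split(".").map { |label|
      label.ascii_only? ? label : "xn--#{encode_label(label)}"
    }.join(".")
  end
end

puts PunycodeSketch.to_ascii_host("日本語.jp")
```

Under these assumptions, the last line should print the same host that Addressable's `normalize` produced above. ASCII-only labels such as `jp` pass through unchanged.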
### Feature Request

I believe it would be very useful if the default `uri` gem had the capacity of:

- Parsing IDNA domain names.
- Converting URLs between their Unicode and ASCII forms.

The `URI::Generic` class already has a `#normalize` method to ensure the host and scheme parts are all lower case; it could be extended to encode IDN hosts into their ASCII equivalent.

It would also be useful if the opposite operation was supported for display purposes. I'm not sure what name such a method could have, perhaps `canonicalize`?

### Implementation

In https://github.com/ruby/uri/issues/76 @skryukov pointed to his pure Ruby implementation of IDNA 2008 (https://github.com/skryukov/uri-idna). I believe it would be good to upstream parts of it into the `uri` gem to implement these features.

--
https://bugs.ruby-lang.org/