On Dec 03, Piotr Strzyzewski via db-wg <db-wg@ripe.net> wrote:
> As the UTF-8 topic was briefly discussed during DB-WG session at RIPE87
> in Rome, I would like to propose moving forward with it. If that means
> a topic for first (?) interim meeting, let it be. Let me know please if
> this works for you. Thanks in advance.

In Rome I talked a bit with Edward about this.
Background: I am the author of the whois client used by all Linux distributions.
I fully agree that switching to UTF-8 is desirable, but we cannot just change the encoding of port 43 without major side effects. Since version 5.5.4 (December 2019), the client assumes that the output of whois.ripe.net is Latin 1 and transcodes it to the system encoding. If the server started sending UTF-8 unannounced, the client would decode it as Latin 1 and users would see mojibake.

My suggestion is to add a new query "command line" option to specify the desired encoding (limited to either ISO-8859-1 or UTF-8), as other whois servers already support. -C is the most common choice, but it may be better to use --charset so as not to use up a single-letter option. See:
https://github.com/rfc1036/whois/blob/next/servers_charset_list

Then, in a few years, it will be much easier to switch the default from Latin 1 to UTF-8.

--
ciao,
Marco
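
P.S. A minimal sketch of the mojibake problem described above, in Python. This is not the client's actual code: the function name `display_as_latin1` and the sample string are made up for illustration, and the only assumption is the one stated in the message, namely that the client decodes the server's bytes as Latin 1.

```python
def display_as_latin1(raw: bytes) -> str:
    """Hypothetical stand-in for a client that assumes the reply is Latin 1."""
    return raw.decode("latin-1")


# The same name, as a server might send it in each encoding.
latin1_reply = "José".encode("latin-1")  # b'Jos\xe9'
utf8_reply = "José".encode("utf-8")      # b'Jos\xc3\xa9'

print(display_as_latin1(latin1_reply))  # José
print(display_as_latin1(utf8_reply))    # JosÃ©  (mojibake)
```

The two-byte UTF-8 sequence for "é" is read as two separate Latin 1 characters, which is why the client would need to be told explicitly which encoding the server is going to send.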