[db-wg] Re: Proposal to Allow UTF-8 in “descr:” and “remarks” Attributes

6 Nov 2025

Dear Cynthia,
...
On 4 Nov 2025, at 18:00, Cynthia Revström <me@cynthia.re> wrote:
I'm for this proposal.
I would like the NCC to clarify a bit more regarding the allowed code points as the IDNA is about domain names which differ from free form text.
Choosing IDNA2008 seems like a reasonable starting point for handling UTF-8, including normalisation and the exclusion of invalid characters. A lot of work has already been done for IDNs (see RFC 9233) that we can benefit from. We plan to pass any UTF-8 input through IDNA2008 and accept only "protocol valid" (PVALID) code points.
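As a rough illustration of the filtering step described above, here is a minimal Python sketch. It is not the NCC's implementation and not a real IDNA2008 library: it assumes NFC as the normalisation form and uses only the "LetterDigits" general categories from RFC 5892 as a stand-in for the full PVALID derivation, which has further rules (for example, real IDNA2008 also excludes code points that are unstable under case folding, a feature the plan above would turn off). The function name and the `extra_allowed` parameter are hypothetical.

```python
import unicodedata

# Assumption: the "LetterDigits" general categories from RFC 5892, used here
# as a rough stand-in for the full PVALID derivation (which has more rules).
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Lm", "Mn", "Mc", "Nd"}


def rejected_code_points(text, extra_allowed=""):
    """Return the characters of NFC-normalised `text` that this sketch rejects.

    `extra_allowed` is a hypothetical knob for characters that free-form
    text may need but that are not valid in domain name labels.
    """
    normalised = unicodedata.normalize("NFC", text)
    return [
        ch
        for ch in normalised
        if ch not in extra_allowed
        and unicodedata.category(ch) not in LETTER_DIGIT_CATEGORIES
    ]


print(rejected_code_points("Zürich"))
print(rejected_code_points("Zürich Netz"))
print(rejected_code_points("Zürich Netz", extra_allowed=" "))
```

In this sketch the space in "Zürich Netz" is rejected unless it is passed in `extra_allowed`, which is one example of how rules designed for domain name labels can differ from what free-form text needs.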

There is good library support for IDNA2008, which will save us the time of implementing something similar from scratch. We will need to turn off some features (we don't want case folding, for example), so we will need to choose a library that gives us some flexibility.

IDNA2008 also allows us to use Punycode encoding to ASCII for compatibility, as we did for IDNs in email addresses (see NWI-11), while returning UTF-8 where supported.
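The ASCII-compatibility idea can be sketched with the Punycode codec (RFC 3492) that ships in Python's standard library. This is only an illustration of the round trip, not the NCC's implementation: the helper names are hypothetical, the value is encoded as a single string, and the "xn--" prefix handling that IDNA applies to domain labels is left out.

```python
def to_ascii(value):
    """Encode a Unicode string to an ASCII-only Punycode form."""
    return value.encode("punycode").decode("ascii")


def to_utf8(encoded):
    """Decode a Punycode form back to the original Unicode string."""
    return encoded.encode("ascii").decode("punycode")


original = "Zürich Netz"
ascii_form = to_ascii(original)

print(ascii_form)                       # ASCII-only representation
print(to_utf8(ascii_form) == original)  # round trip restores the UTF-8 text
```

A client that cannot handle UTF-8 would see only the ASCII form, while a UTF-8-capable client could be given the decoded text.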

I suggest we implement UTF-8 support using IDNA2008, and if something is lacking that the community needs, then we adjust the implementation accordingly.

Regards
Ed Shryane
RIPE NCC
...
-Cynthia
